Abstract

We study the verification problem in distributed networks, stated as follows. Let H be a
subgraph of a network G where each vertex of G knows which edges incident on it are in H. We 
would like to verify whether H has some properties, e.g., if it is a tree or if it is connected (every 
node knows at the end of the process whether H has the specified property or not). We would 
like to perform this verification in a decentralized fashion via a distributed algorithm. The time 
complexity of verification is measured as the number of rounds of distributed communication. 

In this paper we initiate a systematic study of distributed verification, and give almost tight 
lower bounds on the running time of distributed verification algorithms for many fundamental 
problems such as connectivity, spanning connected subgraph, and s-t cut verification. We
then show applications of these results in deriving strong unconditional time lower bounds on 
the hardness of distributed approximation for many classical optimization problems including 
minimum spanning tree, shortest paths, and minimum cut. Many of these results are the first 
non-trivial lower bounds for both exact and approximate distributed computation and they 
resolve previous open questions. Moreover, our unconditional lower bound for approximating
minimum spanning tree (MST) subsumes and improves upon the previous hardness of approximation
bound of Elkin [STOC 2004] as well as the lower bound for (exact) MST computation
of Peleg and Rubinovich [FOCS 1999]. Our result implies that there can be no distributed
approximation algorithm for MST that is significantly faster than the current exact algorithm,
for any approximation factor.

Our lower bound proofs show an interesting connection between communication complexity 
and distributed computing which turns out to be useful in establishing the time complexity of 
exact and approximate distributed computation of many problems. 

1 Introduction 

Large and complex networks, such as the human society, the Internet, or the brain, are being studied 
intensely by different branches of science. Each individual node in such a network can directly 
communicate only with its neighboring nodes. Despite being restricted to such local communication, 
the network itself should work towards a global goal, i.e., it should organize itself, or deliver a service. 

In this work we investigate the possibilities and limitations of distributed/decentralized
computation, i.e., to what degree local information is sufficient to solve global tasks. Many tasks can
be solved entirely via local communication, for instance, computing how many friends of friends one has.
Research in the last 30 years has shown that some classic combinatorial optimization problems such
as matching, coloring, dominating set, or approximations thereof can be solved using small (i.e.,
polylogarithmic) local communication. For example, a maximal independent set can be computed
in time $O(\log n)$ [25], but not in time $o(\sqrt{\log n/\log\log n})$ [18] ($n$ is the network size). This lower
bound even holds if message sizes are unbounded.

However, many important optimization problems are "global" problems from the distributed
computation point of view. To count the total number of nodes, to determine the diameter of the
system, or to compute a spanning tree, information necessarily must travel to the farthest nodes
in a system. If exchanging a message over a single edge costs one time unit, one needs $\Omega(D)$ time
units to compute the result, where $D$ is the network diameter. If message size were unbounded,
one could simply collect all the information in $O(D)$ time, and then compute the result. Hence,
in order to arrive at a realistic problem, we need to introduce communication limits, i.e., each
node can exchange messages with each of its neighbors in each step of a synchronous system, but
each message can have at most $B$ bits (typically $B$ is small, say $O(\log n)$). However, to compute
a spanning tree, even single-bit messages are enough, as one can simply breadth-first-search the
graph in time $O(D)$ and this is optimal [28].
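
To make the $O(D)$ claim concrete, the following is a minimal centralized simulation of synchronous breadth-first flooding. It is only a sketch under our own assumptions: the function name `bfs_spanning_tree` and the adjacency-dictionary representation are hypothetical, and the sketch ignores termination detection, which costs additional rounds in a real network. The number of rounds it reports equals the eccentricity of the root, which is at most $D$.

```python
def bfs_spanning_tree(adj, root):
    """Simulate synchronous flooding from `root` (hypothetical helper).

    adj maps each vertex to a list of its neighbours.
    Returns (parent, rounds): the BFS tree as a parent map and the
    number of communication rounds in which some node was newly reached.
    """
    parent = {root: None}
    frontier = [root]
    rounds = 0
    while frontier:
        next_frontier = []
        # In one round, every frontier node sends a single bit
        # ("join my tree") over each incident edge.
        for u in frontier:
            for v in adj[u]:
                if v not in parent:
                    parent[v] = u          # v adopts the first sender as parent
                    next_frontier.append(v)
        if next_frontier:
            rounds += 1
        frontier = next_frontier
    return parent, rounds


# A path on five vertices has diameter 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
tree_from_end, rounds_from_end = bfs_spanning_tree(path, 0)
tree_from_mid, rounds_from_mid = bfs_spanning_tree(path, 2)
```

Starting from an endpoint of the path the flooding needs as many rounds as the diameter, while starting from the middle it needs half as many; in both cases the parent map describes a spanning tree with $n - 1$ edges.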

But can we verify whether an existing subgraph that is claimed to be a spanning tree indeed
is a correct spanning tree? In this paper we show that this is not generally possible in $O(D)$ time;
instead one needs $\Omega(\sqrt{n} + D)$ time. (Thus, in contrast to traditional non-distributed complexity,
verification is harder than computation in the distributed world!) Our paper is more general,
as we show interesting lower and upper bounds (these are almost tight) for a whole selection 
of verification problems. Furthermore, we show a key application of studying such verification 
problems to proving strong unconditional time lower bounds on exact and approximate distributed 
computation for many classical problems. 

1.1 Technical Background and Previous Work 

Distributed Computing Consider a synchronous network of processors with unbounded com- 
putational power. The network is modeled by an undirected n-vertex graph, where vertices model 
the processors and edges model the links between the processors. The processors (henceforth, ver- 
tices) communicate by exchanging messages via the links (henceforth, edges). The vertices have 
limited global knowledge, in particular, each of them has its own local perspective of the network
(a.k.a. graph), which is confined to its immediate neighborhood. The vertices may have to compute
(cooperatively) some global function of the graph, such as a spanning tree (ST) or a minimum span- 
ning tree (MST), via communicating with each other and running a distributed algorithm designed 
for the task at hand. There are several measures to analyze the performance of such algorithms, a 
fundamental one being the running time, defined as the worst-case number of rounds of distributed 
communication. This measure naturally gives rise to a complexity measure of problems, called the 
time complexity. On each round at most B bits can be sent through each edge in each direction, 
where B is the bandwidth parameter of the network. The design of efficient algorithms for this 
model (henceforth, the B model), as well as establishing lower bounds on the time complexity 
of various fundamental graph problems, has been the subject of an active area of research called 
(locality-sensitive) distributed computing (see [28] and references therein).

Distributed Algorithms, Approximation, and Hardness Much of the initial research focus 
in the area of distributed computing was on designing algorithms for solving problems exactly, e.g., 
distributed algorithms for ST, MST, and shortest paths are well-known [28, 26]. Over the last few
years, there has been interest in designing distributed algorithms that provide approximate solutions 
to problems. This area is known as distributed approximation. One motivation for designing such 
algorithms is that they can run faster or have better communication complexity albeit at the cost 
of providing suboptimal solution. This can be especially appealing for resource-constrained and 
dynamic networks (such as sensor or peer-to-peer networks). For example, there is not much point 
in having an optimal algorithm in a dynamic network if it takes too much time, since the topology 
could have changed by that time. For this reason, in the distributed context, such algorithms 
are well-motivated even for network optimization problems that are not NP-hard, e.g., minimum
spanning tree, shortest paths etc. There is a large body of work on distributed approximation 
algorithms for various classical graph optimization problems (e.g., see the surveys by Elkin [7] and 
Dubhashi et al. [6], and the work of [15] and the references therein).

While a lot of progress has been made in the design of distributed approximation algorithms, the 
same has not been the case with the theory of lower bounds on the approximability of distributed 
problems, i.e., hardness of distributed approximation. There are some inapproximability results that 
are based on lower bounds on the time complexity of the exact solution of certain problems and 
on integrality of the objective functions of these problems. For example, a fundamental result due 
to Linial [23] says that 3-coloring an $n$-vertex ring requires $\Omega(\log^* n)$ time. In particular, it implies
that any 3/2-approximation protocol for the vertex-coloring problem requires $\Omega(\log^* n)$ time. On
the other hand, one can state inapproximability results assuming that vertices are computationally 
limited; under this assumption, any NP-hardness inapproximability result immediately implies an 
analogous result in the distributed model. However, the above results are not interesting in the 
distributed setting, as they provide no new insights on the roles of locality and communication [10].

There are but a few significant results currently known on the hardness of distributed ap- 
proximation. Perhaps the first important result was presented for the MST problem by Elkin 
in [10]. Specifically, he showed strong unconditional lower bounds (i.e., ones that do not depend
on complexity-theoretic assumptions) for distributed approximate MST (more on this result be- 
low). Later, Kuhn, Moscibroda, and Wattenhofer [18] showed lower bounds on time approximation 
trade-offs for several problems. 
1.2 Distributed Verification



The above discussion summarized two major research aspects in distributed computing, namely 
studying distributed algorithms and lower bounds for (1) exact and (2) approximate solutions to 
various problems. The third aspect, called distributed verification, turns out to have remarkable
applications to the first two and is the main subject of the current paper. In distributed
verification, we want to efficiently check whether a given subgraph of a network has a specified
property via a distributed algorithm.¹ Formally, given a graph $G = (V, E)$, a subgraph $H = (V, E')$
with $E' \subseteq E$, and a predicate $\Pi$, it is required to decide whether $H$ satisfies $\Pi$ (i.e., when the
algorithm terminates, every node knows whether $H$ satisfies $\Pi$). The predicate $\Pi$ may specify
statements such as "$H$ is connected" or "$H$ is a spanning tree" or "$H$ contains a cycle". (Each
vertex in $G$ knows which of its incident edges (if any) belong to $H$.) The goal is to study bounds on
the time complexity of distributed verification. The time complexity of the verification algorithm is 
measured with respect to parameters of G (in particular, its size n and diameter D), independently 
from H. 

We note that verification is different from construction problems, which have been the traditional 
focus in distributed computing. Indeed, distributed algorithms for constructing spanning trees, 
shortest paths, and other problems have been well studied ([28, 26]). However, the corresponding
verification problems have received much less attention. To the best of our knowledge, the only
distributed verification problem that has received some attention is the MST (i.e., verifying if $H$
is an MST); the recent work of Kor et al. [16] gives an $\Omega(\sqrt{n}/B + D)$ deterministic lower bound on
distributed verification of MST, where $D$ is the diameter of the network $G$. That paper also gives
a matching upper bound (see also [17]). Note that distributed construction of MST has rather
similar lower and upper bounds [29, 12]. Thus in the case of the MST problem, verification and
construction have the same time complexity. We later show that the above result of Kor et al. is
subsumed by the results of this paper, as we show that verifying any spanning tree already takes
this much time.

Motivations The study of distributed verification has two main motivations. The first is under- 
standing the complexity of verification versus construction. This is obviously a central question in 
the traditional RAM model, but here we want to focus on the same question in the distributed 
model. Unlike in the centralized setting, it turns out that verification is not in general easier than 
construction in the distributed setting! In fact, as was indicated earlier, distributively verifying a 
spanning tree turns out to be harder than constructing it in the worst case. Thus understanding the 
complexity of verification in the distributed model is also important. Second, from an algorithmic 
point of view, for some problems, understanding the verification problem can help in solving the 
construction problem or showing the inherent limitations in obtaining an efficient algorithm. In 
addition to these, there is yet another motivation that emerges from this work: We show that
distributed verification leads to showing strong unconditional lower bounds on distributed computation
(both exact and approximate) for a variety of problems, many hitherto unknown. For example, we
show that establishing a lower bound on the spanning connected subgraph verification problem 
leads to establishing lower bounds for the minimum spanning tree, shortest path tree, minimum 
cut etc. Hence, studying verification problems may lead to proving hardness of approximation as 
well as lower bounds for exact computation for new problems. 

¹Such problems have been studied in the sequential setting, e.g., Tarjan [32] studied verification of MST.
1.3 Our Contributions



In this paper, our main contributions are twofold. First, we initiate a systematic study of dis- 
tributed verification, and give almost tight uniform lower bounds on the running time of distributed 
verification algorithms for many fundamental problems. Second, we make progress in establishing 
strong hardness results on the distributed approximation of many classical optimization problems. 
Our lower bounds also apply seamlessly to exact algorithms. We next state our main results (the 
precise theorem statements are in the respective sections as mentioned below). 



1. Distributed Verification We show a lower bound of $\Omega(\sqrt{n/(B\log n)} + D)$ for many verification
problems in the $B$ model, including spanning connected subgraph, s-t connectivity, cycle-containment,
bipartiteness, cut, least-element list, and s-t cut (cf. definitions in Section [5]). These
bounds apply to Monte Carlo randomized algorithms as well and clearly hold also for asynchronous
networks. Moreover, it is important to note that our lower bounds apply even to graphs of small
diameter ($D = O(\log n)$). Furthermore, we present slightly weaker lower bounds for even smaller
(constant) diameters. (Indeed, the problems studied in this paper are "global" problems, i.e., the
network diameter of $G$ imposes an inherent lower bound on the time complexity.)

Additionally, we show that another fundamental problem, namely, the spanning tree verification
problem (i.e., verifying whether $H$ is a spanning tree) has the same lower bound of $\Omega(\sqrt{n/(B\log n)} + D)$
(cf. Section [6]). However, this bound applies only to deterministic algorithms. This result
strengthens the deterministic lower bound result of minimum spanning tree verification by Kor et
al. [16] in that it shows that the same lower bound holds even on the simpler problem of spanning
tree verification. Moreover, we note the interesting fact that although finding a spanning tree
(e.g., a breadth-first tree) can be done in $O(D)$ rounds [28], verifying if a given subgraph is a
spanning tree requires $\Omega(\sqrt{n} + D)$ rounds! Thus the verification problem for spanning trees is
harder than its construction in the distributed setting. This is in contrast to the same well-studied
problem in the centralized setting. Apart from the spanning tree verification problem, we also
show deterministic lower bounds for other verification problems, including Hamiltonian cycle and 
simple path verification. 

Our lower bounds are almost tight as we show that there exist algorithms that run in $O(\sqrt{n}\log^* n + D)$
rounds (assuming $B = O(\log n)$) for almost all the verification problems addressed here (cf. Section [8]).
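
To get a feeling for how close the lower bound $\Omega(\sqrt{n/(B\log n)} + D)$ and the upper bound $O(\sqrt{n}\log^* n + D)$ are, the expressions inside the asymptotic notation can be evaluated numerically. The sketch below is purely illustrative: the function names are hypothetical, all hidden constants are set to one, and logarithms are taken to base 2 (an assumption that does not affect the asymptotics).

```python
import math


def log_star(n):
    """Iterated logarithm (base 2): how often log2 must be applied until the value is <= 1."""
    count = 0
    x = float(n)
    while x > 1.0:
        x = math.log2(x)
        count += 1
    return count


def lower_bound_expr(n, B, D):
    """Expression inside the Omega(.) of the lower bound, constants ignored."""
    return math.sqrt(n / (B * math.log2(n))) + D


def upper_bound_expr(n, D):
    """Expression inside the O(.) of the upper bound (for B = O(log n)), constants ignored."""
    return math.sqrt(n) * log_star(n) + D


n = 2 ** 16
lb = lower_bound_expr(n, B=16, D=16)   # sqrt(65536 / 256) + 16
ub = upper_bound_expr(n, D=16)         # 256 * log*(65536) + 16
```

With $B = \log n$ the two expressions differ by a factor of at most $\log n \cdot \log^* n$, which is the sense in which the bounds are "almost tight".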

2. Bounds on Hardness of Distributed Approximation An important consequence of our 
verification lower bound is that it leads to lower bounds for exact and approximate distributed 
computation. We show the unconditional time lower bound of $\Omega(\sqrt{n/(B\log n)} + D)$ for approximating
many optimization problems, including MST, shortest s-t path, shortest path tree, and
minimum cut (Section [7]). The important point to note is that the above lower bound applies for
any approximation ratio $\alpha \geq 1$. Thus the same bound holds for exact algorithms as well (that is,
$\alpha = 1$). All these hardness bounds hold for randomized algorithms. (In fact, these bounds hold
for Monte Carlo randomized algorithms while previous lower bounds [29, 10, 21] hold only for Las
Vegas randomized algorithms.) As in our verification lower bounds, these bounds apply even to
graphs of small ($O(\log n)$) diameter. Figure [1] summarizes our lower bounds for various diameters.

Our results improve over previous ones (e.g., Elkin's lower bound for approximate MST and
shortest path tree [10]) and subsume some well-established exact bounds (e.g., Peleg and Rubinovich's
lower bound for MST [29]) as well as show new strong bounds (both for exact and approximate
computation) for many other problems (e.g., minimum cut), thus answering some questions
that were open earlier (see the survey by Elkin [7]).

Figure 1: Lower bounds of randomized $\alpha$-approximation algorithms on graphs of various diameters.
Bounds in the first column are for the MST and shortest path tree problems [10] while those in the
second column are for these problems and many problems listed in Figure [2]. We note that these
bounds almost match the $O(\sqrt{n}\log^* n + D)$ upper bound for the MST problem [12, 21] and are
independent of the approximation factor $\alpha$. Also note a simple observation that lower bounds for
graphs of diameter $D$ also hold for graphs of larger diameters.

The new lower bound for approximating MST simplifies and improves upon the previous
$\Omega(\sqrt{n/(\alpha B \log n)} + D)$ lower bound by Elkin [10], where $\alpha$ is the approximation factor. Elkin [10]
showed a tradeoff between the running time and the approximation ratio of MST. Our result shows
that approximating MST requires $\Omega(\sqrt{n/(B\log n)} + D)$ rounds, regardless of $\alpha$. Thus our result
shows that there is actually no trade-off, since there can be no distributed approximation algorithm
for MST that is significantly faster than the current exact algorithm [21, 9], for any approximation
factor $\alpha > 1$.

1.4 Overview of Technical Approach 

We prove our lower bounds by establishing an interesting connection between communication 
complexity and distributed computing. Our lower bound proofs consider the family of graphs 
evolved through a series of papers in the literature [10, 24, 29]. However, while previous results
[29, 10, 24, 18] rely on counting the number of states needed to solve the mailing problem (along
with some sophisticated techniques for its variant, called the corrupted mailing problem, in the case
of approximation algorithm lower bounds) and use Yao's method [36] (with appropriate input distributions)
to get lower bounds for randomized algorithms, our results are achieved using a few
steps of simple reductions, starting from problems in communication complexity, as follows (also
see Figure [2] for details).

(Section [3]) First, we reduce the lower bounds of problems in the standard communication
complexity model [20] to the lower bounds of the equivalent problems in the "distributed version"
of communication complexity. Specifically, we prove the Simulation Theorem (cf. Section [3]) which
relates the communication lower bound from the standard communication complexity model [20] to
compute some appropriately chosen function $f$, to the distributed time complexity lower bound for
computing the same function in a specially chosen graph $G$. In the standard model, Alice and Bob
can communicate directly (via a bidirectional edge of bandwidth one). In the distributed model, we
assume that Alice and Bob are some vertices of $G$ and they together wish to compute the function
$f$ using the communication graph $G$. The choice of graph $G$ is critical. We use a graph called
$G(\Gamma, d, p)$ (parameterized by $\Gamma$, $d$ and $p$) that was first used in [10]. We show a reduction from the
standard model to the distributed model, the proof of which relies on some observations used in
previous results (e.g., [29]).

Figure 2: Problems and reductions between them to obtain randomized and deterministic lower
bounds. For all problems, we obtain lower bounds as in Figure [1]. In order to get the whole picture
of the paper, we recommend reading along the black dashed line. Definitions of (Monte Carlo)
randomized algorithms can be found in Section 2.1. Definitions of problems in communication
complexity, distributed verification of functions, distributed verification of networks and distributed
approximation can be found in Sections 2.2, 2.3, 2.4 and 2.5, respectively.

(Section [4]) The connection established in the first step allows us to bypass the state counting
argument and Yao's method, and reduces our task in proving lower bounds of verification problems
to merely picking the right function $f$ to reduce from. The function $f$ that is useful in showing our
randomized lower bounds is the set disjointness function [1, 14, 2, 31], which is the quintessential
problem in the world of communication complexity with applications to diverse areas and has been
studied for decades (see a recent survey in [3]). Following a result well known in communication
complexity [20], we show that the distributed version of this problem has an $\Omega(\sqrt{n/(B\log n)})$ lower
bound on graphs of small diameter.

(Sections [5] & [6]) We then reduce this problem to the verification problems using simple reductions
similar to those used in data streams [13]. The set disjointness function yields randomized lower
bounds and works for many problems (see Figure [2]), but it does not reduce to certain other problems
such as spanning tree. To show lower bounds for these other problems, we use a different function
$f$ called the equality function. However, this reduction yields only deterministic lower bounds for the
corresponding verification problems.

(Section [7]) Finally, we reduce the verification problem to hardness of distributed approximation
for a variety of problems to show that the same lower bounds hold for approximation algorithms
as well. For this, we use a reduction whose idea is similar to one used to prove hardness of
approximating TSP (Traveling Salesman Problem) on general graphs (see, e.g., [34]): We convert
a verification problem to an optimization problem by introducing edge weights in such a way that
there is a large gap between the optimal values for the cases where $H$ satisfies, or does not satisfy,
a certain property. This technique is surprisingly simple, yet yields strong unconditional hardness
bounds, many of them hitherto unknown or left open (e.g., minimum cut) [7] and some that improve over
known ones (e.g., MST and shortest path tree) [10]. As mentioned earlier, our approach shows that
approximating MST by any factor needs $\Omega(\sqrt{n})$ time, while the previous result due to Elkin gave
a bound that depends on $\alpha$ (the approximation factor), i.e., $\Omega(\sqrt{n/\alpha})$, using more sophisticated
techniques.
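
The gap idea can be illustrated centrally for MST. The sketch below is not the construction proved in Section [7]; it assumes, for illustration only, that edges of $H$ receive weight 1 and all other edges of $G$ receive weight $\alpha n$, and all function names are hypothetical. Under this assumption, if $H$ is a connected spanning subgraph then the MST weighs at most $n - 1$, so any $\alpha$-approximate value is at most $\alpha(n-1)$; otherwise every spanning tree contains a heavy edge and weighs at least $\alpha n$.

```python
def gap_instance(n, edges_G, edges_H, alpha):
    """Assign weight 1 to edges of H and weight alpha * n to the rest (assumed weighting)."""
    in_H = {frozenset(e) for e in edges_H}
    heavy = alpha * n
    return [(1 if frozenset((u, v)) in in_H else heavy, u, v) for u, v in edges_G]


def mst_weight(n, weighted_edges):
    """Kruskal's algorithm with union-find; vertices are 0..n-1 and G is assumed connected."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    total = 0
    for w, u, v in sorted(weighted_edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            total += w
    return total


def decide_from_approximation(n, alpha, approx_value):
    """Given any value in [OPT, alpha * OPT], report whether H is a connected spanning subgraph."""
    return approx_value <= alpha * (n - 1)


# G is a 4-cycle; alpha = 2.
n, alpha = 4, 2
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
opt_yes = mst_weight(n, gap_instance(n, cycle, [(0, 1), (1, 2), (2, 3)], alpha))
opt_no = mst_weight(n, gap_instance(n, cycle, [(0, 1), (2, 3)], alpha))
```

In this sketch even the worst allowed approximation of the "yes" instance ($\alpha \cdot$ `opt_yes`) stays below the exact optimum of the "no" instance, so an $\alpha$-approximation algorithm for MST would also decide the verification problem.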

Figure [2] summarizes these reductions that will be proved in this paper. Our proof technique via 
this approach is quite general and conceptually straightforward to apply as it hides all complexities 
in the well studied communication complexity. Yet, it yields tight lower bounds for many prob- 
lems (we show almost matching upper bounds for many problems in Section [8]). It also has some 
advantages over the previous approaches. First, in the previous approach, we have to start from 
scratch every time we want to prove a lower bound for a new problem. For example, extending from 
the mailing problem in [29] to the corrupted mailing problem in [10] requires some sophisticated
techniques. Our new technique allows us to use known lower bounds in communication complexity 
to do such a task. Secondly, extending a deterministic lower bound to a randomized one is some- 
times difficult. As in our case, our randomized lower bound of the spanning connected subgraph 
problem would be almost impossible without connecting it to the communication complexity lower 
bound of the set disjointness problem (whose strong randomized lower bound is a result of years 
of studies [1, 14, 2, 31]). One important consequence is that this technique allows us to obtain
lower bounds for Monte Carlo randomized algorithms while previous lower bounds hold only for 
Las Vegas randomized algorithms. We believe that this technique could lead to many new lower 
bounds of distributed algorithms.

Recent results After the preliminary version of this paper ([5]) appeared, the connection between
communication complexity and distributed algorithm lower bounds has been further used to develop
some new lower bounds. In [27], the Simulation Theorem (cf. Theorem 3.1) is extended to show
a connection between bounded-round communication complexity and distributed algorithm lower
bounds. It is then used to show a tight lower bound for distributed random walk algorithms. In [11],
lower bounds of computing diameter of a network and related problems are shown by reduction from
the communication complexity of set disjointness. This is done by considering the communication
at the bottleneck of the network (sometimes called bisection width [20, 22]). A similar argument is
also used in [19] to show lower bounds on directed networks.

2 Preliminaries 

To make definitions easy to look up, we collect all necessary definitions in this section. We
recommend that readers skip this section on a first reading and come back when necessary.

This section is organized as follows (also see Figure [2] for a pointer to a subsection for each
definition). In Subsection 2.1, we define the notion of $\epsilon$-error randomized public-coin algorithms and
the worst-case running time of these algorithms. In Subsection 2.2, we define the communication
complexity model and the set disjointness and equality problems. We then extend this model to
the model of distributed verification of functions in Subsection 2.3. In Subsection 2.4, we give a
formal definition of the distributed verification problem which we explained informally in Section [1].
We also define the specific distributed verification problems considered in this paper. Finally, in
Subsection 2.5, we define the notion of approximation algorithms.

2.1 Randomized (Monte Carlo) Public-Coin Algorithms and the Worst-Case 
Running Time 

In this paper, we show lower bounds of distributed algorithms that are Monte Carlo. Recall that
a Monte Carlo algorithm is a randomized algorithm whose output may be incorrect with some
probability. Formally, let $\mathcal{A}$ be any algorithm for computing a function $f$. We say that $\mathcal{A}$ computes
$f$ with $\epsilon$-error if for every input $x$, $\mathcal{A}$ outputs $f(x)$ with probability at least $1 - \epsilon$. Note that a
0-error algorithm is deterministic.

We note the fact that lower bounds of Monte Carlo algorithms also imply lower bounds of Las 
Vegas algorithms (whose output is always correct but the running time is only in expectation). 
Thus, lower bounds in this paper hold for both types of algorithms. 

Public coin We say that a randomized distributed algorithm uses a public coin if all nodes have 
access to a common random string (chosen according to some probability distribution). In this
paper, we are interested in the lower bounds of public-coin randomized distributed algorithms. We 
note that these lower bounds also imply time lower bounds of private-coin randomized distributed 
algorithms, where nodes do not share a random string, since allowing a public coin only gives more 
power to the algorithms. 
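
As an illustration of the two notions just defined ($\epsilon$-error and public coin), the following sketch implements the standard textbook random-parity protocol for testing whether two bit strings are equal. It is not taken from this paper: the function names are hypothetical, and the shared seed is our stand-in for the common random string. Alice sends $k$ parity bits and Bob compares them with his own, so $k + 1$ bits are exchanged regardless of the input length; equal inputs are always accepted, and unequal inputs are wrongly accepted with probability at most $2^{-k}$.

```python
import random


def parity(bits, mask):
    """Inner product of two bit vectors modulo 2."""
    return sum(b & m for b, m in zip(bits, mask)) % 2


def public_coin_eq(x, y, k, shared_seed):
    """Monte Carlo equality test with error at most 2**-k on unequal inputs.

    x is Alice's input, y is Bob's input; both parties derive the same
    random masks from `shared_seed`, which models the public coin.
    """
    assert len(x) == len(y)
    coin = random.Random(shared_seed)
    for _ in range(k):
        mask = [coin.randrange(2) for _ in x]
        # Alice sends parity(x, mask); Bob compares it with parity(y, mask).
        if parity(x, mask) != parity(y, mask):
            return 0          # a differing parity proves x != y
    return 1                  # all k parities agreed


x = [1, 0, 1, 1]
y = [1, 0, 0, 1]
same = public_coin_eq(x, x, k=64, shared_seed=7)
different = public_coin_eq(x, y, k=64, shared_seed=7)
```

Because equality admits such a cheap randomized protocol, it is plausible that reductions from the equality function give only deterministic lower bounds, whereas set disjointness remains hard even for randomized protocols.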
Worst-case running time For any public-coin randomized distributed algorithm $\mathcal{A}$ on a network
$G$ and input $X$ (given to nodes in $G$), we define the worst-case running time of $\mathcal{A}$ on input $X$ to be
the maximum number of rounds needed to run $\mathcal{A}$ among all possible (shared) random strings. The
worst-case running time of $\mathcal{A}$ is the maximum, over all inputs $X$, of the worst-case running time of
$\mathcal{A}$ on $X$.

2.2 Communication Complexity 

In this paper, we consider the standard model of communication complexity. To avoid confusion, 
we define the model as a special case of the distributed algorithm model. We refer to [20] for the
conventional definition, further details and discussions. 

In this model, there are two nodes in the network connected by an edge. We call one node
Alice and the other node Bob. Alice and Bob each receive a $b$-bit binary string, for some integer
$b \geq 1$, denoted by $x$ and $y$ respectively. Together, they both want to compute $f(x, y)$ for a Boolean
function $f : \{0,1\}^b \times \{0,1\}^b \to \{0,1\}$. At the end of the process, we want both Alice and Bob
to know the value of $f(x, y)$. We are interested in the worst-case running time of distributed
algorithms on this network when one bit can be sent on the edge in each round (thus the running
time is equal to the number of bits Alice and Bob exchange).

For any Boolean function $f$ and $\epsilon > 0$, we let $R^{cc-pub}_{\epsilon}(f)$ denote the minimum worst-case running
time of the best $\epsilon$-error randomized algorithm for computing $f$ in the communication complexity
model.

In this paper, we are interested in two Boolean functions, set disjointness (disj) and equality 
(eq) functions, defined as follows. 

• Set Disjointness function (disj). Given two $b$-bit strings $x$ and $y$, the set disjointness
function, denoted by disj$(x, y)$, is defined to be one if the inner product $\langle x, y \rangle$ is $0$ (i.e., there
is no $i$ such that $x_i = y_i = 1$) and zero otherwise.

• Equality function (eq). Given two $b$-bit strings $x$ and $y$, the equality function, denoted by
eq$(x, y)$, is defined to be one if $x = y$ and zero otherwise.
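
The two functions defined above can be written down directly. The following sketch represents $b$-bit strings as Python lists of 0/1 values; this representation and the helper `inner_product` are our own choices for illustration.

```python
def inner_product(x, y):
    """Integer inner product of two equal-length bit vectors."""
    assert len(x) == len(y)
    return sum(xi * yi for xi, yi in zip(x, y))


def disj(x, y):
    """Set disjointness: 1 if no index i has x[i] = y[i] = 1, else 0."""
    return 1 if inner_product(x, y) == 0 else 0


def eq(x, y):
    """Equality: 1 if the two bit strings are identical, else 0."""
    assert len(x) == len(y)
    return 1 if list(x) == list(y) else 0


# x and y are characteristic vectors of the sets {0, 2} and {1}: disjoint.
disjoint_case = disj([1, 0, 1], [0, 1, 0])
# The sets {0, 2} and {2} share element 2: not disjoint.
intersecting_case = disj([1, 0, 1], [0, 0, 1])
```

Viewing $x$ and $y$ as characteristic vectors of subsets of $\{0, \ldots, b-1\}$ explains the name: disj is one exactly when the two subsets do not intersect.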

2.3 Distributed Verification of Functions 

We consider the same problem as in the case of communication complexity. That is, Alice and
Bob receive $b$-bit binary strings $x$ and $y$ respectively and they want to compute $f(x, y)$ for some
Boolean function $f$. However, Alice and Bob are now distinct vertices in a $B$-model distributed
network $G$ (cf. Section [1.1]). We denote Alice's node (which receives $x$) by $s$ and Bob's node (which
receives $y$) by $r$. At the end of the process, both $s$ and $r$ will output $f(x, y)$. We are interested in
the worst-case running time a distributed algorithm needs in order to compute function $f$.

For any network $G$ (with two nodes marked as $s$ and $r$), Boolean function $f$ and $\epsilon > 0$, we let
$R^{G}_{\epsilon}(f)$ denote the worst-case running time of the best $\epsilon$-error randomized distributed algorithm for
computing $f$ on $G$.

In this model, we consider the set disjointness and equality functions as in the communication
complexity model (cf. Subsection 2.2).
2.4 Distributed Verification of Networks

We already gave an informal definition of this problem in Section [TJ. We now define the problem 
formally. In the distributed network G, we describe its subgraph H as an input as follows. Each 
node v in G with neighbors u_1, . . . , u_{d(v)}, where d(v) is the degree of v, has d(v) Boolean indicator 
variables Y_v(u_1), . . . , Y_v(u_{d(v)}) indicating which of the edges incident to v participate in the subgraph 
H. The indicator variables must be consistent, i.e., for every edge (u, v), Y_v(u) = Y_u(v) (this is 
easy to verify locally with a single round of communication). 

Let H_Y be the set of edges whose indicator variables are 1; that is, 

H_Y = {(u, v) ∈ E | Y_u(v) = 1}. 

Given a predicate Π (which may specify statements such as "H_Y is connected" or "H_Y is a spanning 
tree" or "H_Y contains a cycle"), the output for a verification problem at each vertex v is an 
assignment to a (Boolean) output variable A_v, where A_v = 1 if H_Y satisfies the predicate Π, and 
A_v = 0 otherwise. 

We say that a distributed algorithm A verifies predicate Π if, for every graph G and subgraph 
H_Y of G, all nodes in G know whether H_Y satisfies Π after we run A; that is, after the execution 
of A on graph G, at each vertex v the output variable A_v is one if H_Y satisfies predicate Π, and 
zero otherwise. Note again that the time complexity of the verification algorithm is measured with 
respect to the size and diameter of G (independently of H_Y). When Y is clear from the context, 
we use H to denote H_Y. 
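
The input encoding above can be illustrated with a short Python sketch. It assumes the indicator variables are given as a nested dictionary Y[v][u] (a representation chosen here for illustration only) and performs the consistency check centrally rather than in one communication round.

```python
def subgraph_from_indicators(Y):
    """Build the edge set H_Y from indicators Y[v][u] in {0, 1}.

    Y is a hypothetical nested dict: Y[v][u] is the indicator that node v
    holds for its incident edge (u, v). Raises ValueError when the two
    endpoints of an edge disagree, i.e. Y[v][u] != Y[u][v].
    """
    edges = set()
    for v, neighbours in Y.items():
        for u, bit in neighbours.items():
            if Y.get(u, {}).get(v) != bit:
                raise ValueError(f"inconsistent indicators on edge ({u}, {v})")
            if bit == 1:
                edges.add(frozenset((u, v)))
    return edges
```

On a triangle a, b, c where only the edges (a, b) and (b, c) are marked, the function returns exactly those two edges.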

We now define problems considered in this paper. 

• connected spanning subgraph verification: We want to verify whether H is connected 
and spans all nodes of G, i.e., every node in G is incident to some edge in H. 

• cycle containment verification: We want to verify if H contains a cycle. 

• e-cycle containment verification: Given an edge e in H (known to vertices adjacent to 
it), we want to verify if H contains a cycle containing e. 

• bipartiteness verification: We want to verify whether H is bipartite. 

• s-t connectivity verification: In addition to G and H, we are given two vertices s and t 
(s and t are known by every vertex). We would like to verify whether s and t are in the same 
connected component of H. 

• connectivity verification: We want to verify whether H is connected. 

• cut verification: We want to verify whether H is a cut of G, i.e., G is not connected when 
we remove edges in H. 

• edge on all paths verification: Given two nodes u, v and an edge e, we want to verify 
whether e lies on all paths between u and v in H. In other words, e is a u-v cut in H. 

• s-t cut verification: We want to verify whether H is an s-t cut, i.e., when we remove 
all edges E_H of H from G, we want to know whether s and t are in the same connected 
component or not. 

• least-element list verification [15]: The input of this problem is different from other 
problems and is as follows. Given a distinct rank (integer) r(v) to each node v in the weighted 
graph G, for any nodes u and v, we say that v is the least element of u if v has the lowest 
rank among vertices of distance at most d(u, v) from u. Here, d(u, v) denotes the weighted 
distance between u and v. The Least-Element List (LE-list) of a node u is the set {(v, d(u, v)) | 
v is the least element of u}. 

In the least-element list verification problem, each vertex knows its rank as an input, and 
some vertex u is given a set S = {(v_1, d(u, v_1)), (v_2, d(u, v_2)), . . .} as an input. We want to 
verify whether S is the least-element list of u. 

• Hamiltonian cycle verification: We would like to verify whether H is a Hamiltonian cycle 
of G, i.e., H is a simple cycle of length n. 

• spanning tree verification: We would like to verify whether H is a tree spanning G. 

• simple path verification: We would like to verify that H is a simple path, i.e., all nodes 
have degree either zero or two in H except two nodes that have degree one and there is no 
cycle in H. 
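
Of the definitions in this list, the least-element list is the least standard, so a small centralized reference computation may help. The sketch below is an illustration under stated assumptions: the graph is an adjacency dictionary with non-negative weights, ranks are distinct integers, and distances are computed with Dijkstra's algorithm; none of these names come from the paper.

```python
import heapq


def le_list(adj, rank, u):
    """Return the LE-list of u as a set of (v, d(u, v)) pairs.

    adj maps each node to {neighbour: non-negative edge weight};
    rank maps each node to a distinct integer (both hypothetical encodings).
    """
    # Dijkstra's algorithm from u.
    dist = {u: 0}
    heap = [(0, u)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue
        for w, weight in adj[v].items():
            nd = d + weight
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    # Scan nodes by increasing distance (ties: lower rank first); v is a least
    # element of u exactly when its rank beats every node seen so far.
    result, best_rank = set(), None
    for v in sorted(dist, key=lambda node: (dist[node], rank[node])):
        if best_rank is None or rank[v] < best_rank:
            result.add((v, dist[v]))
            best_rank = rank[v]
    return result
```

On the weighted path a, b, c with edge weights 1 and 2 and ranks decreasing away from a, every node is a least element of a; if a itself has the lowest rank, its LE-list contains only (a, 0).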

2.5 Approximation Algorithms 

In a graph optimization problem P in a distributed network, such as finding an MST, we are given a 
non-negative weight ω(e) on each edge e of the network (each node knows the weights of all edges 
incident to it). Each pair of network and weight function (G, ω) comes with a nonempty set of 
feasible solutions for a problem P; e.g., for the case of finding an MST, all spanning trees of G are 
feasible solutions. The goal of P is to find a feasible solution that minimizes or maximizes the total 
weight. We call such a solution an optimal solution. For example, a spanning tree of minimum 
weight is an optimal solution for the MST problem. 

For any α ≥ 1, an α-approximate solution of P on weighted network (G, ω) is a feasible solution 
whose weight is at most α times (respectively, at least 1/α times) the weight of the optimal solution 
of P if P is a minimization (respectively, maximization) problem. We say that an algorithm A 
is an α-approximation algorithm for problem P if it outputs an α-approximate solution for any 
weighted network (G, ω). In the case of randomized algorithms (cf. Subsection 2.1), we say that an 
α-approximation T-time algorithm is ε-error if it outputs an answer that is not α-approximate 
with probability at most ε and always finishes in time T, regardless of the input and the choice of 
random string. 

In this paper, we consider the following problems. 

• In the minimum spanning tree problem [10, 29], we want to compute the weight of the 
minimum spanning tree (i.e., the spanning tree of minimum weight). At the end of the process 
all nodes should know this weight. 

• Consider a network with two cost functions associated to edges, weight and length, and a 
root node r. For any spanning tree T, the radius of T is the maximum length (defined by 
the length function) between r and any leaf node of T. Given a root node r and the desired 
radius ℓ, a shallow-light tree [28] is the spanning tree whose radius is at most ℓ and whose 
total weight is minimized (among trees of the desired radius). 

• Given a node s, the s-source distance problem [8] is to find the distance from s to every 
node. At the end of the process, every node knows its distance from s. 

• In the shortest path tree problem [10], we want to find the shortest path spanning tree 
rooted at some input node s, i.e., the shortest path from s to any node t must have the same 
weight as the unique path from s to t in the solution tree. At the end of the process, each 
node should know which edges incident to it are in the shortest path tree. 

• The minimum routing cost spanning tree problem [35] is defined as follows. We think 
of the weight of an edge as the cost of routing messages through this edge. The routing cost 
between any nodes u and v in a given spanning tree T, denoted by c_T(u, v), is the distance 
between them in T. The routing cost of the tree T itself is the sum over all pairs of vertices of 
the routing cost for the pair in the tree, i.e., Σ_{u,v ∈ V} c_T(u, v). Our goal is to find a spanning 
tree with minimum routing cost. 

• A set of edges E' is a cut of G if G is not connected when we delete E'. The minimum cut 
problem [7] is to find a cut of minimum weight. A set of edges E' is an s-t cut if there is no 
path between s and t when we delete E' from G. The minimum s-t cut problem is to find 
an s-t cut of minimum weight. 

• Given two nodes s and t, the shortest s-t path problem is to find the length of the shortest 
path between s and t. 

• The generalized Steiner forest problem [15] is defined as follows. We are given k disjoint 
subsets of vertices V_1, . . . , V_k (each node knows which subset it is in). The goal is to find a 
minimum weight subgraph in which each pair of vertices belonging to the same subset is 
connected. At the end of the process, each node knows which edges incident to it are in the 
solution. 

Note that in the minimum spanning tree, minimum cut, minimum s-t cut, and shortest s-t 
path problems, an α-approximation algorithm should find a solution that has total weight at 
most α times the weight of the optimal solution. For the s-source distance problem, an 
α-approximation algorithm should find an approximate distance d(v) of every vertex v such that 
distance(s, v) ≤ d(v) ≤ α · distance(s, v), where distance(s, v) is the distance of s from v. Similarly, 
an α-approximation algorithm for the shortest path tree problem should find a spanning tree T such 
that, for any node v, the length ℓ of the unique path from s to v in T satisfies ℓ ≤ α · distance(s, v). 

3 From Communication Complexity to Distributed Computing 

In this section, we show a connection between the communication complexity model (cf. Section 2.2) 
and the model of distributed verification of functions (cf. Section 2.3) on a family of graphs called 
G(Γ, d, p). This family of graphs was first defined in [10] (which was extended from [29]). We will 
define this graph in Subsection 3.1 for completeness. 

The main result of this section shows that if there is a fast ε-error algorithm for computing 
f on G(Γ, d, p), then there is a fast ε-error algorithm for Alice and Bob to compute f in the 
communication complexity model. We call this the Simulation Theorem. We state the theorem 
below. The rest of this section is devoted to defining the graph G(Γ, d, p) and proving the theorem. 

Figure 3: An example of G(Γ, d, p) (here d = 2). 

Theorem 3.1 (Simulation Theorem). For any Γ, d, p, B, ε ≥ 0, and function f : {0, 1}^b × {0, 1}^b → 
{0, 1}, if there is an ε-error distributed algorithm on G(Γ, d, p) that computes f faster than (d^p − 1)/2 
time, i.e., 

R_ε^{G(Γ,d,p)}(f) < (d^p − 1)/2, 

then there is an ε-error algorithm in the communication complexity model that computes f in at 
most 2dpB · R_ε^{G(Γ,d,p)}(f) time. In other words, 

R_ε^{cc-pub}(f) ≤ 2dpB · R_ε^{G(Γ,d,p)}(f). 

We first describe the graph G(Γ, d, p) with parameters Γ, d and p and distinct vertices s and r. 

3.1 Description of G(Γ, d, p) [10] 

We now describe the network G(Γ, d, p) in detail. The two basic units in the construction are 
paths and a tree. There are Γ paths, denoted by P^1, P^2, . . . , P^Γ, each having d^p nodes, i.e., for 
ℓ = 1, 2, . . . , Γ, 

V(P^ℓ) = {v^ℓ_0, . . . , v^ℓ_{d^p−1}} and E(P^ℓ) = {(v^ℓ_i, v^ℓ_{i+1}) | 0 ≤ i < d^p − 1}. 

There is a tree, denoted by T, having depth p where each non-leaf node has d children (thus, there 
are d^p leaf nodes). We denote the nodes of T at level ℓ from left to right by u^ℓ_0, . . . , u^ℓ_{d^ℓ−1} (so, u^0_0 is 
the root of T and u^p_0, . . . , u^p_{d^p−1} are the leaves of T). For any ℓ and j, the leaf node u^p_j is connected 
to the corresponding path node v^ℓ_j by a spoke edge (u^p_j, v^ℓ_j). Finally, we set the two special nodes 
(which will receive input strings x and y) as s = u^p_0 and r = u^p_{d^p−1}. Figure 3 depicts this network. 
We note the following lemma proved in [10]. 

Lemma 3.2 ([10]). The number of vertices in G(Γ, d, p) is n = Θ(Γ d^p) and its diameter is 2p + 2. 
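
The construction can be sanity-checked on small parameters. The sketch below builds G(Γ, d, p) in Python and measures it with breadth-first search. The vertex labels ("v", ℓ, j) for path nodes and ("u", level, j) for tree nodes, and all function names, are assumptions of this sketch rather than notation from [10].

```python
from collections import deque


def build_G(Gamma, d, p):
    """Adjacency sets of G(Gamma, d, p), using this sketch's own vertex labels."""
    adj = {}

    def add_edge(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    leaves = d ** p
    for l in range(1, Gamma + 1):
        for j in range(leaves - 1):           # path edges of P^l
            add_edge(("v", l, j), ("v", l, j + 1))
        for j in range(leaves):               # spoke edges (u^p_j, v^l_j)
            add_edge(("u", p, j), ("v", l, j))
    for level in range(p):                    # tree edges: parent index is j // d
        for j in range(d ** (level + 1)):
            add_edge(("u", level, j // d), ("u", level + 1, j))
    return adj


def eccentricity(adj, source):
    """Largest BFS distance from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return max(dist.values())


def diameter(adj):
    return max(eccentricity(adj, v) for v in adj)
```

The exact vertex count is Γ d^p + (d^{p+1} − 1)/(d − 1), which is Θ(Γ d^p) as the lemma states; for Γ = 2, d = 3, p = 2 this gives 31. The sketch treats 2p + 2 as an upper bound on the diameter, since very small instances can fall below it (with Γ = 2, d = 2, p = 1 the diameter is 3).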

3.2 Terminologies 

For any 1 ≤ i ≤ ⌊(d^p − 1)/2⌋, define the i-left and the i-right of path P^ℓ as 

L_i(P^ℓ) = {v^ℓ_j | j ≤ d^p − 1 − i} and R_i(P^ℓ) = {v^ℓ_j | j ≥ i}, 

respectively. Thus, L_0(P^ℓ) = R_0(P^ℓ) = V(P^ℓ). Define the i-left of the tree T, denoted by L_i(T), 
as the union of the set S = {u^p_j | j ≤ d^p − 1 − i} and all ancestors of all vertices in S. Similarly, 
the i-right R_i(T) of the tree T is the union of the set S = {u^p_j | j ≥ i} and all ancestors of all vertices 
in S. Now, the i-left and i-right sets of G(Γ, d, p) are the union of those left and right sets, 

L_i = ⋃_ℓ L_i(P^ℓ) ∪ L_i(T) and R_i = ⋃_ℓ R_i(P^ℓ) ∪ R_i(T). 

For i = 0, we modify the definition and set L_0 = V \ {r} and R_0 = V \ {s}. See Figure 4. 

Let A be any deterministic distributed algorithm run on graph G(Γ, d, p) for computing a 
function f. Fix any input strings x and y given to s and r respectively. Let φ_A(x, y) denote the 
execution of A on x and y. Denote the state of the vertex v at the end of round t during the 
execution φ_A(x, y) by σ_A(v, t, x, y). 

We note the following important property of distributed algorithms. The state of a vertex v at 
the end of time t is uniquely determined by its input and the sequence of messages on each of its 
incoming links from time 1 to t. Intuitively, this is because a distributed algorithm is simply a set 
of algorithms run on different nodes in a network. The algorithm on each node behaves according 
to its input and the sequence of messages sent to it so far. From this, for example, we can conclude 
that in two different executions φ_A(x, y) and φ_A(x', y'), a vertex reaches the same state at time t 
(i.e., σ_A(v, t, x, y) = σ_A(v, t, x', y')) if and only if it receives the same sequence of messages on each 
of its incoming links. 

For a given set of vertices U = {v_1, . . . , v_ℓ} ⊆ V, a configuration 

C_A(U, t, x, y) = ⟨σ_A(v_1, t, x, y), . . . , σ_A(v_ℓ, t, x, y)⟩ 

is a vector of the states of the vertices of U at the end of round t of the execution φ_A(x, y). 

3.3 Observations 

We note the following crucial observations developed in [29], [10], [Mj, [16]. We will need Lemma 3.4 
to prove Theorem 3.1 in the next subsection. 

Observation 3.3. For any set U ⊆ U' ⊆ V, C_A(U, t, x, y) can be uniquely determined by 
C_A(U', t − 1, x, y) and all messages sent to U from V \ U' at time t. 

Proof. Recall that the state of each vertex v in U can be uniquely determined by its state 
σ_A(v, t − 1, x, y) at time t − 1 and the messages sent to it at time t. Moreover, the messages sent to v 
from vertices inside U' can be determined by C_A(U', t − 1, x, y). Thus if the messages sent from 
vertices in V \ U' are given then we can determine all messages sent to U at time t and thus we 
can determine C_A(U, t, x, y). □ 

From now on, to simplify notations, when A, x and y are clear from the context, we use C_{L_t} 
and C_{R_t} to denote C_A(L_t, t, x, y) and C_A(R_t, t, x, y), respectively. The lemma below states that 
C_{L_t} (C_{R_t}, respectively) can be determined by C_{L_{t−1}} (C_{R_{t−1}}, respectively) and dp messages generated 
by some vertices in R_{t−1} (L_{t−1}, respectively) at time t. It essentially follows from Observation 3.3 
and an observation that there are at most dp edges linking between vertices in V \ R_{t−1} (V \ L_{t−1}, 
respectively) and vertices in R_t (L_t, respectively). 

Figure 4: Examples of i-right sets. 

Lemma 3.4. Fix any deterministic algorithm A and input strings x and y. For any 0 < t < 
(d^p − 1)/2, there exist functions g_L and g_R, B-bit messages M_1^{L_{t−1}}, . . . , M_{dp}^{L_{t−1}} sent by some vertices 
in L_{t−1} at time t, and B-bit messages M_1^{R_{t−1}}, . . . , M_{dp}^{R_{t−1}} sent by some vertices in R_{t−1} at time t 
such that 

C_{L_t} = g_L(C_{L_{t−1}}, M_1^{R_{t−1}}, . . . , M_{dp}^{R_{t−1}}), and (1) 

C_{R_t} = g_R(C_{R_{t−1}}, M_1^{L_{t−1}}, . . . , M_{dp}^{L_{t−1}}). (2) 

Proof. We prove Eq. (2) only. (Eq. (1) is proved in exactly the same way.) First, observe the 
following facts about neighbors of nodes in R_t. 

• All neighbors of all path vertices in R_t are in R_{t−1}. Example: In Figure 4, path vertices 
in R_2 are v^ℓ_2, . . . , v^ℓ_{d^p−1} for ℓ = 1, . . . , Γ. Observe that all neighbors of these vertices, i.e., 
v^ℓ_1, . . . , v^ℓ_{d^p−1} for all ℓ and u^p_2, . . . , u^p_{d^p−1}, are in R_1. 

• All neighbors of all leaf vertices in V(T) ∩ R_t are in R_{t−1}. Example: In Figure 4, leaf vertices in 
R_2 are u^p_2, . . . , u^p_{d^p−1}. Their neighbors, i.e., v^ℓ_2, . . . , v^ℓ_{d^p−1} for all ℓ and u^{p−1}_1, . . . , u^{p−1}_{d^{p−1}−1}, are 
all in R_1. 

• For any non-leaf tree vertex u^ℓ_i, for any ℓ and i, if u^ℓ_i is in R_t then its parent and vertices 
u^ℓ_{i+1}, u^ℓ_{i+2}, . . . , u^ℓ_{d^ℓ−1} are in R_{t−1}. Example: In Figure 4, u^{p−1}_1 is in R_2. Thus, its parent 
(u^{p−2}_0) and u^{p−1}_2, . . . , u^{p−1}_{d^{p−1}−1} are in R_1. 

• For any i and ℓ, if u^ℓ_i is in R_t then all children of u^ℓ_{i+1} are in R_t (otherwise, all children of u^ℓ_i 
are not in R_t and neither is u^ℓ_i, a contradiction). Example: In Figure 4, u^{p−1}_1 is in R_2. Thus, all 
children of u^{p−1}_2 are in R_2. 

Let u^ℓ(R_t) denote the leftmost vertex that is at level ℓ of T and in R_t, i.e., u^ℓ(R_t) = u^ℓ_i 
where i is such that u^ℓ_i ∈ R_t and u^ℓ_{i−1} ∉ R_t. (For example, in Figure 4, u^{p−1}(R_1) = u^{p−1}_0 and 
u^{p−1}(R_2) = u^{p−1}_1.) From the above observations, we conclude that the only neighbors of nodes in 
R_t that are not in R_{t−1} are children of u^ℓ(R_t), for all ℓ. In other words, all edges linking between 
vertices in R_t and V \ R_{t−1} are in the following form: (u^ℓ(R_t), u') for some ℓ and child u' of u^ℓ(R_t). 

Setting U' = R_{t−1} and U = R_t in Observation 3.3, we have that C_{R_t} can be uniquely determined 
by C_{R_{t−1}} and messages sent to u^ℓ(R_t) from its children in V \ R_{t−1}. Note that each of these messages 
contains at most B bits since they correspond to a message sent on an edge in one round. 

Observe further that, for any t < (d^p − 1)/2, V \ R_{t−1} ⊆ L_{t−1} since L_{t−1} and R_{t−1} share some 
path vertices. Moreover, each u^ℓ(R_t) has d children. Therefore, if we let M_1^{L_{t−1}}, . . . , M_{dp}^{L_{t−1}} be the 
messages sent from children of u^0(R_t), u^1(R_t), . . . , u^{p−1}(R_t) in V \ R_{t−1} to their parents (note that 
if there are fewer than dp such messages then we add some empty messages) then we can uniquely 
determine C_{R_t} by C_{R_{t−1}} and M_1^{L_{t−1}}, . . . , M_{dp}^{L_{t−1}}. Eq. (2) thus follows. □ 
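
The counting step in this proof (at most dp edges join R_t to V \ R_{t−1}) can be checked mechanically on small instances. The Python sketch below is only an illustration: it encodes path nodes as ("v", ℓ, j) and tree nodes as ("u", level, j), which are labels chosen for this sketch, and it uses R_0 = V \ {s} as in Subsection 3.2.

```python
def vertices_and_edges(Gamma, d, p):
    """Vertex set and edge list of G(Gamma, d, p) in this sketch's labelling."""
    leaves = d ** p
    V = {("v", l, j) for l in range(1, Gamma + 1) for j in range(leaves)}
    V |= {("u", level, j) for level in range(p + 1) for j in range(d ** level)}
    E = [(("v", l, j), ("v", l, j + 1))
         for l in range(1, Gamma + 1) for j in range(leaves - 1)]   # path edges
    E += [(("u", p, j), ("v", l, j))
          for l in range(1, Gamma + 1) for j in range(leaves)]      # spoke edges
    E += [(("u", level, j // d), ("u", level + 1, j))
          for level in range(p) for j in range(d ** (level + 1))]   # tree edges
    return V, E


def right_set(Gamma, d, p, i):
    """The i-right set R_i; for i = 0 this is V minus {s}."""
    V, _ = vertices_and_edges(Gamma, d, p)
    leaves = d ** p
    if i == 0:
        return V - {("u", p, 0)}
    R = {("v", l, j) for l in range(1, Gamma + 1) for j in range(i, leaves)}
    for j in range(i, leaves):                # leaves u^p_j, j >= i, and ancestors
        for level in range(p + 1):
            R.add(("u", level, j // d ** (p - level)))
    return R


def crossing_edges(Gamma, d, p, t):
    """Edges with one endpoint in R_t and the other in V minus R_{t-1}."""
    V, E = vertices_and_edges(Gamma, d, p)
    inside = right_set(Gamma, d, p, t)
    outside = V - right_set(Gamma, d, p, t - 1)
    return [(a, b) for a, b in E
            if (a in inside and b in outside) or (b in inside and a in outside)]
```

For example, with Γ = 2, d = 2, p = 3 and t = 3, the only crossing edge joins u^1_0 (the leftmost level-1 vertex in R_3) to its child u^2_0, in line with the form (u^ℓ(R_t), u') derived above.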

Using the above lemma, we can now prove Theorem 3.1. 

3.4 Proof of the Simulation Theorem (cf. Theorem 3.1) 

Let f be the function in the theorem statement. Let A_ε be any ε-error distributed algorithm for 
computing f on network G(Γ, d, p). Fix a random string r̄ used by A_ε (shared by all vertices in 
G(Γ, d, p)) and consider the deterministic algorithm A run on the input of A_ε and the fixed random 
string r̄. Let T_A be the worst case running time of algorithm A (over all inputs). We note that 
T_A < (d^p − 1)/2, as assumed in the theorem statement. We show that Alice and Bob, when given r̄ 
as the public random string, can simulate A using at most 2dpB · T_A communication bits, as follows. 

Alice and Bob make T_A iterations of communications. Initially, Alice computes C_{L_0}, which 
depends only on x. Bob also computes C_{R_0}, which depends only on y. In each iteration t > 0, 
we assume that Alice and Bob know C_{L_{t−1}} and C_{R_{t−1}}, respectively, before the iteration starts. 
Then, Alice and Bob will exchange at most 2dpB bits so that Alice and Bob know C_{L_t} and C_{R_t}, 
respectively, at the end of the iteration. 

To do this, Alice sends to Bob the messages M_1^{L_{t−1}}, . . . , M_{dp}^{L_{t−1}} as in Lemma 3.4. Alice can 
generate these messages since she knows C_{L_{t−1}} (by assumption). Then, Bob can compute C_{R_t} 
using Eq. (2) in Lemma 3.4. Similarly, Bob sends dp messages to Alice and Alice can compute C_{L_t}. 
They exchange at most 2dpB bits in total in each iteration since there are 2dp messages, each of 
B bits, exchanged. 

After T_A iterations, Alice knows C_A(L_{T_A}, T_A, x, y) and Bob knows C_A(R_{T_A}, T_A, x, y). In particular, 
they know the output of A (output by s and r) since Alice and Bob know the states of s and 
r, respectively, after A terminates. They can thus output the output of A. 

Since Alice and Bob output exactly the output of A, they will answer correctly if and only if A 
answers correctly. Thus, if A is ε-error then so is the above communication protocol between Alice 
and Bob. Moreover, Alice and Bob communicate at most 2dpB · T_A bits. The theorem follows. 






4 Distributed Verification of Set Disjointness and Equality Functions 

In this section, we show lower bounds for distributed algorithms verifying set disjointness and 
equality. The definitions of both problems can be found in Section 2.2 and the model of distributed 
verification of functions can be found in Section 2.3. The results in this section are simple corollaries 
of the Simulation Theorem (cf. Theorem 3.1) and will serve as important building blocks in showing 
lower bounds in later sections. 



4.1 Randomized Lower Bound of Set Disjointness Function 

To prove the lower bound of verifying disj, we simply use the communication complexity lower 
bound of computing disj [U [HI [H [31], i.e., R_ε^{cc-pub}(disj) = Ω(b) where b is the size of input 
strings x and y. 

Lemma 4.1. For any Γ, d, p, there exists a constant ε > 0 such that 

R_ε^{G(Γ,d,p)}(disj) = Ω(min(d^p, b/(dpB))), 

where b is the size of input strings x and y of disj; i.e., any ε-error algorithm computing function 
disj on G(Γ, d, p) requires Ω(min(d^p, b/(dpB))) time. 

Proof. If R_ε^{G(Γ,d,p)}(disj) ≥ (d^p − 1)/2 then R_ε^{G(Γ,d,p)}(disj) = Ω(d^p) and we are done. Otherwise, the 
conditions of Theorem 3.1 are fulfilled and it implies that R_ε^{cc-pub}(disj) ≤ 2dpB · R_ε^{G(Γ,d,p)}(disj). 
Now we use the fact that R_ε^{cc-pub}(disj) = Ω(b) for the function disj on b-bit inputs, for some ε > 0 
[D O m EI] (also see [20, Example 3.22] and references therein). It follows that R_ε^{G(Γ,d,p)}(disj) = 
Ω(b/(dpB)). □ 



4.2 Deterministic Lower Bound of Equality Function 

To prove the lower bound of verifying eq, we simply use the deterministic communication complexity 
lower bound of computing eq [S^, i.e., R_0^{cc-pub}(eq) = Ω(b) where b is the size of input strings x 
and y (see, e.g., [20, Example 1.21] and references therein). 



Lemma 4.2. For any Γ, d, p, 

R_0^{G(Γ,d,p)}(eq) = Ω(min(d^p, b/(dpB))), 

where b is the size of input strings x and y of eq; i.e., any deterministic algorithm computing 
function eq on G(Γ, d, p) requires Ω(min(d^p, b/(dpB))) time. 

Proof. If R_0^{G(Γ,d,p)}(eq) ≥ (d^p − 1)/2 then R_0^{G(Γ,d,p)}(eq) = Ω(d^p) and we are done. Otherwise, the 
conditions of Theorem 3.1 are fulfilled and it implies that R_0^{cc-pub}(eq) ≤ 2dpB · R_0^{G(Γ,d,p)}(eq). 
Now we use the fact that R_0^{cc-pub}(eq) = Ω(b) for the function eq on b-bit inputs. It follows that 
R_0^{G(Γ,d,p)}(eq) = Ω(b/(dpB)). □ 

5 Randomized Lower Bounds for Distributed Verification 



In this section, we present randomized lower bounds for many verification problems on graphs of 
various diameters, as shown in Figure [TJ. These problems are defined in Section 2.4. The key 
ingredient is the lower bound of verifying the set disjointness function on distributed networks (cf. 
Lemma 4.1). The general theorem is as follows. 

Theorem 5.1. For any p ≥ 1, B ≥ 1, and n ∈ {2^{2p+1}pB, 3^{2p+1}pB, . . .}, there exists a constant 
ε > 0 such that any ε-error distributed algorithm for any of the following problems requires 
Ω((n/(pB))^{1/2 − 1/(2(2p+1))}) time on some Θ(n)-vertex graph of diameter 2p + 2 in the B model: spanning 
connected subgraph, cycle containment, e-cycle containment, bipartiteness, s-t connectivity, 
connectivity, cut, edge on all paths, s-t cut and least-element list. 

In particular, for graphs with diameter D = 4, we get an Ω((n/B)^{1/3}) lower bound and for graphs 
with diameter D = log n we get Ω(√(n/(B log n))). Similar analysis also leads to an Ω(√(n/B)) lower 
bound for graphs of diameter n^δ for any δ > 0, and an Ω((n/B)^{1/4}) lower bound for graphs of diameter 
three using the same analysis as in [10]. We note again that the lower bound holds even in the 
public coin model where every vertex shares a random string. 

Organization. This section is organized as follows. In the first three subsections, we show lower 
bounds that need a reduction from the set disjointness problem (i.e., problems in the third column in 
Figure 2): spanning connected subgraph verification in Subsection 5.1, s-t connectivity verification 
in Subsection 5.2, and cycle containment, e-cycle containment, and bipartiteness verification in 
Subsection 5.3 (these problems are proved together as they use the same construction). The lower 
bounds on the remaining problems (connectivity, cut, edge on all paths, s-t cut and least-element 
list verification) are in Subsection 5.4. 

5.1 Lower Bound of Spanning Connected Subgraph Verification Problem 

The lower bound of spanning connected subgraph verification essentially follows from the following 
lemma which says that an algorithm for solving spanning connected subgraph verification can be 
used to compute disj as well. 

Lemma 5.2. For any Γ, d ≥ 2, p and ε ≥ 0, if there exists an ε-error distributed algorithm for the 
spanning connected subgraph verification problem on graph G(Γ, d, p) then there exists an ε-error 
algorithm for verifying disj (on Γ-bit inputs) on G(Γ, d, p) that uses the same time complexity. 

Proof. Consider an ε-error algorithm A for the spanning connected subgraph verification problem, 
and suppose that we are given an instance of the set disjointness problem with Γ-bit input strings 
x and y (given to s and r). We use A to solve this instance of the set disjointness problem by 
constructing H as follows. 

First, we mark all path edges and tree edges as participating in H. All spoke edges are marked 
as not participating in subgraph H, except those incident to s and r for which we do the following: 
For each bit x_i, 1 ≤ i ≤ Γ, vertex s indicates that the spoke edge (s, v^i_0) participates in H if and 
only if x_i = 0. Similarly, for each bit y_i, 1 ≤ i ≤ Γ, vertex r indicates that the spoke edge (r, v^i_{d^p−1}) 
participates in H if and only if y_i = 0. (See Figure 5.) 

Note that the participation of all edges, except those incident to s and r, is decided independently 
of the input. Moreover, one round is sufficient for s and r to inform their neighbors of 
the participation of edges incident to them. Hence, one round is enough to construct H. Then, 
algorithm A is started. 

Figure 5: Example of H for the spanning connected subgraph problem (marked with dashed edges 
(red edges)) when x = 0...10 and y = 1...00. 

Once algorithm A terminates, vertex r determines its output for the set disjointness problem 
by stating that both input strings are disjoint if and only if the spanning connected subgraph 
verification algorithm verified that the given subgraph H is indeed a spanning connected subgraph. 

Observe that H is a spanning connected subgraph if and only if for all 1 ≤ i ≤ Γ at least one of 
the edges (s, v^i_0) and (r, v^i_{d^p−1}) is in H; thus, by the construction of H, H is a spanning connected 
subgraph if and only if the input strings x, y are disjoint, i.e., for every i either x_i = 0 or y_i = 0. 
Hence the resulting algorithm correctly solves the given instance of the set disjointness problem 
whenever A correctly solves the spanning connected subgraph verification problem on the constructed 
subgraph H. This happens with probability at least 1 − ε. □ 
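
The equivalence at the heart of this proof can be confirmed by brute force on small instances. The Python sketch below enumerates all pairs of Γ-bit inputs, builds H as described, and compares "H is a spanning connected subgraph" with disj(x, y). It is a centralized check, not a distributed algorithm; the vertex labels and function names are assumptions of the sketch.

```python
from itertools import product


def components(vertices, edges):
    """Number of connected components of (vertices, edges), via union-find."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(v) for v in vertices})


def reduction_holds(Gamma, d, p):
    """Check: H spanning connected <=> x and y disjoint, for all x, y."""
    leaves = d ** p
    s, r = ("u", p, 0), ("u", p, leaves - 1)
    V = {("v", l, j) for l in range(1, Gamma + 1) for j in range(leaves)}
    V |= {("u", lev, j) for lev in range(p + 1) for j in range(d ** lev)}
    path = [(("v", l, j), ("v", l, j + 1))
            for l in range(1, Gamma + 1) for j in range(leaves - 1)]
    tree = [(("u", lev, j // d), ("u", lev + 1, j))
            for lev in range(p) for j in range(d ** (lev + 1))]
    for x, y in product(product((0, 1), repeat=Gamma), repeat=2):
        H = path + tree                       # all path and tree edges participate
        H += [(s, ("v", i + 1, 0)) for i in range(Gamma) if x[i] == 0]
        H += [(r, ("v", i + 1, leaves - 1)) for i in range(Gamma) if y[i] == 0]
        spanning_connected = components(V, H) == 1
        disjoint = all(not (a and b) for a, b in zip(x, y))
        if spanning_connected != disjoint:
            return False
    return True
```

Calling reduction_holds(3, 2, 2) examines all 64 input pairs on G(3, 2, 2).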

Using Lemma 4.1, we obtain the following result. 

Corollary 5.3. For any Γ, d, p, there exists a constant ε > 0 such that any ε-error algorithm 
for the spanning connected subgraph verification problem requires Ω(min(d^p, Γ/(dpB))) time on some 
Θ(Γ d^p)-vertex graph of diameter 2p + 2. 

In particular, if we consider Γ = d^{p+1}pB then Ω(min(d^p, Γ/(dpB))) = Ω(d^p). Moreover, by 
Lemma 3.2, G(d^{p+1}pB, d, p) has n = Θ(d^{2p+1}pB) vertices and thus the lower bound of Ω(d^p) 
becomes Ω((n/(pB))^{1/2 − 1/(2(2p+1))}). Theorem 5.1 (for the case of spanning connected subgraph) follows. 
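
The parameter choice can be verified with exact arithmetic. The short Python sketch below (function names are illustrative) checks that Γ = d^{p+1}pB balances the two terms of the minimum, and that the exponent 1/2 − 1/(2(2p+1)) equals p/(2p+1), so that (n/(pB)) raised to it gives d^p when n/(pB) = d^{2p+1}.

```python
from fractions import Fraction


def exponent(p):
    """The exponent 1/2 - 1/(2(2p+1)) from Theorem 5.1, as an exact fraction."""
    return Fraction(1, 2) - Fraction(1, 2 * (2 * p + 1))


def check_parameters(d, p, B):
    """Return Gamma = d^(p+1) * p * B after checking the two identities."""
    Gamma = d ** (p + 1) * p * B
    # With this Gamma, both terms of min(d^p, Gamma/(dpB)) equal d^p.
    assert Fraction(Gamma, d * p * B) == d ** p
    # p/(2p+1) is the power that turns d^(2p+1) into d^p.
    assert exponent(p) == Fraction(p, 2 * p + 1)
    return Gamma
```

For p = 1 the exponent is 1/3, matching the Ω((n/B)^{1/3}) bound for diameter 4.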

5.2 Lower Bound of s-t Connectivity Verification Problem 

We again modify the proof of Lemma 5.2 to prove the following lemma. 

Lemma 5.4. For any Γ, d ≥ 2, p and ε ≥ 0, if there exists an ε-error distributed algorithm for the 
s-t connectivity verification problem on graph G(Γ, d, p) then there exists an ε-error algorithm for 
verifying disj (on Γ-bit inputs) on G(Γ, d, p) that uses the same time complexity. 

Proof. We use the same argument as in the proof of Lemma 5.2 except that we construct the 
subgraph H as follows. 

First, all path edges are marked as participating in subgraph H. All tree edges are marked as 
not participating in H. All spoke edges, except those incident to s and r, are also marked as not 
participating. For each bit x_i, 1 ≤ i ≤ Γ, vertex s indicates that the spoke edge (s, v^i_0) participates 
in H if and only if x_i = 1. Similarly, for each bit y_i, 1 ≤ i ≤ Γ, vertex r indicates that the spoke 
edge (r, v^i_{d^p−1}) participates in H if and only if y_i = 1. (See Figure 6.) 

Observe that s and r are connected in H if and only if there exists 1 ≤ i ≤ Γ such that both 
edges (v^i_0, s) and (v^i_{d^p−1}, r) are in H; thus, by the construction of H, H is s-r connected if and only if 
the input strings x and y are not disjoint. □ 
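
As with the previous reduction, the claimed equivalence can be checked exhaustively on small instances. The following Python sketch is a centralized illustration with labels and names chosen for this example: it builds H from the path edges and the input-dependent spoke edges only, and tests whether s and r fall in the same component.

```python
from itertools import product


def same_component(vertices, edges, a, b):
    """True if a and b are connected in the graph (vertices, edges)."""
    adj = {v: [] for v in vertices}
    for u, w in edges:
        adj[u].append(w)
        adj[w].append(u)
    seen, stack = {a}, [a]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return b in seen


def st_reduction_holds(Gamma, d, p):
    """Check: s and r connected in H <=> x and y are not disjoint."""
    leaves = d ** p
    s, r = ("u", p, 0), ("u", p, leaves - 1)
    V = {("v", l, j) for l in range(1, Gamma + 1) for j in range(leaves)}
    V |= {("u", lev, j) for lev in range(p + 1) for j in range(d ** lev)}
    path = [(("v", l, j), ("v", l, j + 1))
            for l in range(1, Gamma + 1) for j in range(leaves - 1)]
    for x, y in product(product((0, 1), repeat=Gamma), repeat=2):
        H = list(path)                        # tree edges do not participate
        H += [(s, ("v", i + 1, 0)) for i in range(Gamma) if x[i] == 1]
        H += [(r, ("v", i + 1, leaves - 1)) for i in range(Gamma) if y[i] == 1]
        intersecting = any(a and b for a, b in zip(x, y))
        if same_component(V, H, s, r) != intersecting:
            return False
    return True
```

Note the polarity change relative to Lemma 5.2: here a spoke edge participates when the bit is 1, so connectivity signals a common index rather than disjointness.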

5.3 Lower Bounds of Cycle Containment, e-Cycle Containment, and Bipartiteness 
Verification Problems 

We modify the proof of Lemma 5.4 to prove the following lemma which says that an algorithm for 
solving problems in this section can be used to compute disj. 

Lemma 5.5. For any Γ, d ≥ 2, p and ε ≥ 0, if there exists an ε-error distributed algorithm 
for solving either the cycle containment, e-cycle containment or bipartiteness verification problem 
on graph G(Γ, d, p) then there exists an ε-error algorithm for verifying disj (on Γ-bit inputs) on 
G(Γ, d, p) that uses the same time complexity. 

Proof. We prove this lemma by modifying the proof of Lemma 5.4. We only note the key difference 
here. 

Figure 7: Example of H for the cycle and e-cycle containment and bipartiteness verification problems when x = 0...10 
and y = 1...00. 

Cycle containment verification problem: We construct H in the same way as in the proof of 
Lemma 5.4 except that the tree edges are participating in H (see Figure 7). 

In the case that the input strings are disjoint, H will consist of the tree connecting s and r as 
well as 1) paths connected to s but not to r, 2) paths connected to r but not to s and 3) paths 
connected neither to r nor s. Thus there is no cycle in H. In the case that the input strings are not 
disjoint, we let i be an index that makes them not disjoint, that is, x_i = y_i = 1. This causes a cycle 
in H consisting of some tree edges and path P^i that are connected by edges (s, v^i_0) and (v^i_{d^p−1}, r) 
at their endpoints. Thus we have the following claim. 

Claim 5.6. H contains a cycle if and only if the input strings are not disjoint. 
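
Claim 5.6 can be checked by enumeration on small instances. The Python sketch below is a centralized illustration, with vertex labels and function names assumed for this example; it detects a cycle using the fact that a simple graph is a forest exactly when its number of edges equals its number of vertices minus its number of components.

```python
from itertools import product


def components(vertices, edges):
    """Number of connected components of (vertices, edges), via union-find."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(v) for v in vertices})


def has_cycle(vertices, edges):
    """A simple graph is a forest iff |E| = |V| - (number of components)."""
    return len(edges) > len(vertices) - components(vertices, edges)


def claim_5_6_holds(Gamma, d, p):
    """Check: H contains a cycle <=> x and y are not disjoint."""
    leaves = d ** p
    s, r = ("u", p, 0), ("u", p, leaves - 1)
    V = {("v", l, j) for l in range(1, Gamma + 1) for j in range(leaves)}
    V |= {("u", lev, j) for lev in range(p + 1) for j in range(d ** lev)}
    path = [(("v", l, j), ("v", l, j + 1))
            for l in range(1, Gamma + 1) for j in range(leaves - 1)]
    tree = [(("u", lev, j // d), ("u", lev + 1, j))
            for lev in range(p) for j in range(d ** (lev + 1))]
    for x, y in product(product((0, 1), repeat=Gamma), repeat=2):
        H = path + tree                       # tree edges now participate
        H += [(s, ("v", i + 1, 0)) for i in range(Gamma) if x[i] == 1]
        H += [(r, ("v", i + 1, leaves - 1)) for i in range(Gamma) if y[i] == 1]
        intersecting = any(a and b for a, b in zip(x, y))
        if has_cycle(V, H) != intersecting:
            return False
    return True
```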

e-cycle containment verification problem: We use the previous construction for H and let e be 
the tree edge adjacent to s (i.e., e connects s to its parent). Observe that, in this construction, H 
contains a cycle if and only if H contains a cycle containing e. Therefore, we have the following 
claim. 

Claim 5.7. e is contained in a cycle in H if and only if the input strings are not disjoint. 

Bipartiteness verification problem: Finally, we can verify if such an edge e is contained in a 
cycle by verifying the bipartiteness. First, we replace e = (s, u^{p−1}_0) by a path (s, v', u^{p−1}_0), where v' 
is an additional/virtual vertex. This can be done without changing the input graph G by having 
vertex s simulate algorithms on both s and v'. The communication between s and v' can be done 
internally. The communication between v' and u^{p−1}_0 can be done by s. We construct H' the same 
way as H with both (s, v') and (v', u^{p−1}_0) marked as participating. 

We observe that if the input strings are not disjoint, then either H or H' is not bipartite. To 
see this, consider two cases: when d^p is even and when it is odd. When d^p is even and the input strings are 
not disjoint, there exists i such that there is a cycle in H consisting of some tree edges (including 
e) and path P^i that are connected by edges (s, v^i_0) and (v^i_{d^p−1}, r) at their endpoints. This cycle is 
of length 2p + (d^p − 1) + 2, an odd number, causing H to be not bipartite. If d^p is odd, then by 
the same argument there is an odd cycle of length (2p + 1) + (d^p − 1) + 2 in H' (this cycle includes 
the edges (s, v') and (v', u^{p−1}_0) that replace e); thus H' is not bipartite. 

Now we consider the converse: If the input strings are disjoint, then H does not contain a cycle 
by the argument of the proof of the cycle containment problem (which uses the same graph). It 
follows that H' does not contain a cycle as well. Therefore, we have the following claim. 

Claim 5.8. H and H' are both bipartite if and only if the input strings are disjoint. 

□ 
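
Claim 5.8 depends on the parity of d^p, so it is worth checking one even and one odd case. The Python sketch below is a centralized illustration under assumed labels and names: it builds H and H' (with the virtual vertex on the edge e) for every input pair and tests both for bipartiteness by 2-colouring.

```python
from itertools import product


def is_bipartite(vertices, edges):
    """2-colour every component with a depth-first search."""
    adj = {v: [] for v in vertices}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    colour = {}
    for start in vertices:
        if start in colour:
            continue
        colour[start] = 0
        stack = [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in colour:
                    colour[w] = 1 - colour[v]
                    stack.append(w)
                elif colour[w] == colour[v]:
                    return False
    return True


def claim_5_8_holds(Gamma, d, p):
    """Check: H and H' are both bipartite <=> x and y are disjoint."""
    leaves = d ** p
    s, r = ("u", p, 0), ("u", p, leaves - 1)
    e = (("u", p - 1, 0), s)                  # tree edge joining s to its parent
    virtual = "v'"                            # extra vertex simulated by s
    V = {("v", l, j) for l in range(1, Gamma + 1) for j in range(leaves)}
    V |= {("u", lev, j) for lev in range(p + 1) for j in range(d ** lev)}
    path = [(("v", l, j), ("v", l, j + 1))
            for l in range(1, Gamma + 1) for j in range(leaves - 1)]
    tree = [(("u", lev, j // d), ("u", lev + 1, j))
            for lev in range(p) for j in range(d ** (lev + 1))]
    for x, y in product(product((0, 1), repeat=Gamma), repeat=2):
        H = path + tree
        H += [(s, ("v", i + 1, 0)) for i in range(Gamma) if x[i] == 1]
        H += [(r, ("v", i + 1, leaves - 1)) for i in range(Gamma) if y[i] == 1]
        # H' subdivides e: remove it and route through the virtual vertex.
        H_prime = [edge for edge in H if edge != e] + [(s, virtual), (virtual, e[0])]
        both = is_bipartite(V, H) and is_bipartite(V | {virtual}, H_prime)
        disjoint = all(not (a and b) for a, b in zip(x, y))
        if both != disjoint:
            return False
    return True
```

The parameters (d, p) = (2, 2) give an even d^p = 4, where the odd cycle appears in H, while (d, p) = (3, 1) give an odd d^p = 3, where it appears in H'.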

We note that the above reduction for the bipartiteness verification problem might seem to 
suggest that one can also prove the lower bound of this problem by reducing from the e-cycle 
verification problem. However, this is not the case. The reason is that the above proof relies on 
the fact that H and H' each contains at most one cycle and such cycle must contain e. In general, 
this might not be the case. 

5.4 Lower Bounds of Connectivity, Cut, Edges on All Paths, s-t Cut and Least- 
element List Verification Problems 

Lower bounds of verification problems in this section are proven using the lower bounds of problems
in Sections 5.1, 5.2 and 5.3.

Connectivity verification problem We reduce from the spanning connected subgraph verification
problem. Let A(G, H) be an algorithm that verifies if H is connected in O(τ(n)) time on any
n-vertex graph G and subgraph H. Now we will use this algorithm to verify whether a subgraph
H' of G is a spanning connected subgraph.

Recall that, by definition, H' is a spanning connected subgraph if and only if every node is
incident to at least one edge in H' and H' is connected. Verifying that every node is incident to
at least one edge in H' can be done locally and all nodes can be notified if this is not the case in
O(D) rounds (by broadcasting). Checking if H' is connected can be done in O(τ(n)) rounds by
running A(G, H'). The total running time for checking if H' is a spanning connected subgraph is
thus O(τ(n) + D). The lower bound of the spanning connected subgraph problem thus applies to
the connectivity verification problem as well.
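
The reduction above can be sketched as a centralized check. This is only an illustration under stated assumptions: `is_connected` is a hypothetical stand-in for the distributed algorithm A(G, H'), and the coverage test stands in for the local check followed by an O(D)-round broadcast.

```python
from collections import defaultdict

def is_connected(edges):
    """Stand-in for A(G, H'): True if the graph formed by `edges` is connected.

    Vertices not incident to any edge are ignored here.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    if not adj:
        return True
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == len(adj)

def is_spanning_connected(vertices, h_edges):
    """Spanning connected = every vertex touches an edge of H', and H' is connected."""
    covered = {x for e in h_edges for x in e}
    if covered != set(vertices):  # local check; failures are broadcast in O(D) rounds
        return False
    return is_connected(h_edges)  # one call to the connectivity verifier
```

The two steps are independent, which is why the running times simply add up to O(τ(n) + D).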

Cut verification problem We again reduce from the spanning connected subgraph problem.
Given a subgraph H, we verify if H is a spanning connected subgraph as follows. Let H̄ be the
graph obtained by removing edges E(H) of H from G. Recall that H is a spanning connected
subgraph if and only if H̄ is not a cut (see definition of a cut in Section 2.4). Thus, we verify if H̄
is a cut and announce that H is a spanning connected subgraph if and only if H̄ is not a cut.

s-t cut verification problem We reduce from s-t connectivity. Similar to above, we use the
fact that H is s-t connected if and only if H̄ is not an s-t cut.

Least-element list verification problem We reduce from s-t connectivity. We set the rank
of s to 0 and the rank of other nodes to any distinct positive integers. We assign weight 0 to all
edges in H and 1 to other edges. Give a set S = {<s, 0>} to vertex t. Then we verify if S is the
least-element list of t. Observe that if s and t are connected by H then the distance between them
must be 0 and thus S is the least-element list of t. Conversely, if s and t are not connected then
the distance between them will be at least one and S will not be the least-element list of t.

Edge on all paths verification problem We reduce from the e-cycle containment problem 
using the following observation: H does not contain a cycle containing e if and only if e lies on all 
paths between u and v in H where u and v are two nodes incident to e. 
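
The observation can be checked by brute force on small graphs. The sketch below is not part of the reduction itself; the helper names `simple_paths`, `e_on_all_paths` and `e_on_cycle` are ours, and the path enumeration is exponential, so it is meant only for tiny examples.

```python
def simple_paths(edges, src, dst):
    """Yield every simple path from src to dst as a list of vertices."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    def extend(path):
        last = path[-1]
        if last == dst:
            yield list(path)
            return
        for nxt in adj.get(last, ()):
            if nxt not in path:
                path.append(nxt)
                yield from extend(path)
                path.pop()

    yield from extend([src])

def e_on_all_paths(h_edges, e):
    """True if every simple path between the endpoints of e uses e."""
    u, v = e
    for path in simple_paths(h_edges, u, v):
        steps = {frozenset(step) for step in zip(path, path[1:])}
        if frozenset(e) not in steps:
            return False
    return True

def e_on_cycle(h_edges, e):
    """True if e lies on a cycle, i.e. its endpoints stay connected without e."""
    u, v = e
    rest = [f for f in h_edges if frozenset(f) != frozenset(e)]
    return any(True for _ in simple_paths(rest, u, v))
```

On a triangle, the edge (1, 2) lies on a cycle and is bypassed by the path 1, 3, 2; on a path graph it lies on no cycle and on every path between its endpoints.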

6 Deterministic Lower Bounds of Distributed Verification 

In this section, we present deterministic lower bounds for Hamiltonian cycle, spanning tree and
simple path verification. These problems are defined in Section 2.4. These lower bounds are proved
in almost the same way as in Section 5. The only difference is that we reduce from the deterministic
lower bound of the Equality problem (cf. Lemma 4.2).

Theorem 6.1. For any p, B ≥ 1, and n ∈ {2^{2p+1}pB, 3^{2p+1}pB, ...}, any deterministic distributed
algorithm for any of the following verification problems requires Ω((n/(pB))^{1/2 − 1/(2(2p+1))}) time on some
Θ(n)-vertex graph of diameter 2p + 2 in the B model: Hamiltonian cycle, spanning tree, and simple
path verification.

We first prove the lower bound of the Hamiltonian cycle problem and later extend to other 
problems. 

6.1 Lower Bound of Hamiltonian Cycle Verification Problem 

We construct G(Γ, 2, p)' from G(Γ, 2, p) by adding edges and vertices to G(Γ, 2, p). We argue that
the Simulation Theorem (cf. Theorem 3.1) holds on G(Γ, 2, p)' as well. Note that since d = 2, T is
a binary tree. Let m be d^p − 1.

First we add edges in such a way that the subgraph induced by the vertices {v^1_0, ..., v^Γ_0} is a
clique and the subgraph induced by the vertices {v^1_m, ..., v^Γ_m} is a clique as well. Observe that the
Simulation Theorem holds on G(Γ, 2, p) with this change since it only relies on Lemma 3.4, which
is true on the modified graph as well.

Now we add edges (u^p_i, u^p_{i+1}) for all 0 ≤ i ≤ m − 1. Thus we shorten the distance between each
pair of nodes u^p_i and u^p_{i+1} from at most three (using two spoke edges and one path edge from the
corresponding path P^j) to one (red dashed edges in Figure 8). This will affect the lower bound by at most a
constant factor. Now for each node u^l_i we add a path of length p − l + 1 containing p − l new nodes
connecting u^l_i to u^p_{i·d^{p−l}+1} (green dotted paths/nodes in Figure 8). This will increase the number
of messages needed in Lemma 3.4 by a factor of two. Thus, the Simulation Theorem still holds.

Further, we add the edges {v^\,u^), {v^^,u^) and {v^^Uq). This will again increase the
number of messages needed in Lemma 3.4 by a constant factor of two and thus the Simulation
Theorem still holds.

Finally, we add the following three edges {uq,vI^), {s,v^^^), and {s,v^^j^). Adding these three
edges will increase the number of messages needed in Lemma 3.4 by at most three and thus the
Simulation Theorem still holds. This completes the description of G(Γ, 2, p)'.

Figure 8: Example of the modification of the tree-part of G(Γ, 2, p) in the case p = 4. The red
dashed edges are new edges (u^p_i, u^p_{i+1}) and the green dotted edges form new paths between nodes
connecting u^l_i to u^p_{i·d^{p−l}+1}.

To simplify and shorten the proof, we do some preparation. First, we consider strings x and y
of length b and define Γ to be 2 + 12b; this changes the bound only by a constant factor. Now,
from x and y, we construct strings of length m (we assume m to be even)

x' := 1 x_1 0 1 x_1 0 1 x_2 0 1 x_2 0 1 ... 0 1 x_b 0 1 x_b 0 1 x̄_1 0 1 x̄_1 0 1 ... 0 1 x̄_b 0 1 x̄_b 0 1 0,
y' := 1 y_1 0 1 y_1 0 1 y_2 0 1 y_2 0 1 ... 0 1 y_b 0 1 y_b 0 1 ȳ_1 0 1 ȳ_1 0 1 ... 0 1 ȳ_b 0 1 ȳ_b 0 1 0,

where x̄_i and ȳ_i denote negations of x_i and y_i respectively.
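
A minimal sketch of this padding, assuming the reading that each bit c contributes the block 1 c 0 twice, followed by the same for each negated bit, and a final 10. The function name `pad` is ours. By construction the result has 12b + 2 symbols, matching the value 2 + 12b.

```python
def pad(bits):
    """Build the padded string from a 0/1 string `bits` of length b.

    Each bit c contributes the block '1c0' twice; then each negated bit
    does the same; the string ends with '10'.
    """
    negated = ["1" if c == "0" else "0" for c in bits]
    blocks = []
    for c in list(bits) + negated:
        blocks.append("1" + c + "0")
        blocks.append("1" + c + "0")
    return "".join(blocks) + "10"

# The running example x = 01 = y gives a string of 12 * 2 + 2 = 26 symbols.
example = pad("01")
```

A bit 0 yields the six symbols 100100 and a bit 1 yields 110110, so two inputs that differ in one position produce padded strings that differ in the second and fifth symbol of the corresponding block.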

Now we construct H in five stages. In the first stage we create some short paths that we call
lines. In the next two stages we construct from these lines two paths S_1 and S_2 by connecting the
lines in special ways with each other (the connections depend on the input strings). In the fourth
stage we construct a path S_3 that will connect the leftover lines with each other. These three paths,
S_1, S_2, and S_3, will cover all nodes. The final stage is to connect all three paths together.

We will construct these paths in such a way that the input strings are equal if and only if the
resulting graph H is a Hamiltonian cycle. Later we will observe that in the case the strings are
equal all three paths will look like disjoint paths when using the graph layout of Figure 3. The
formal description of the five stages will be accompanied by a small example in Figures 9 and 10.
Here, x = 01 = y.

Stage 1 We create the lines by marking most path edges (to be more precise, all edges {Vj,Vj_^_-^) 
for all i G [1, F] and j G {2, . . . ,m — 2} for j G {1, . . . ,m — 1}) as participating in subgraph H. In 
addition we add the edges (^^m-i'^m) ("^O'^i) ^° ^- These basic elements are called lines now 
(see Figure [9]). 

Stage 2 Define path Si as follows. All spoke edges incident to Ti^ are marked as not participating 
in H, except those incident to s and r. For each bit x^, 1 < i < F, vertex s indicates that the edge 
(^O'^o^^) participates in H if and only if x^ = 1. Similarly, for each bit y^, 1 < i < F, vertex r 
indicates that the spoke edge participates in H if and only if y'_i = 0. Furthermore for

2 < i < r each edge {vq, v\) participates in H if and only if x'j^_^ 7^ x[. Similarly for 2 < i <T each 
edge ('^m-i'^rn) participates in H if and only if y'-_^ 7^ y[. In addition we let edges {vq,vI) and 
('Wm-ii'^^m) participate in H. We denote the path that results from connecting the lines according 
to the rules above by Si. An example is given in Figure [9j 



Figure 9: Example of the reduction using input strings x = 01 = y. Note that Γ is 12 · 2 + 2 = 26
and we use d = 2 and p = 4. In Stage 1, we add blue paths to H, which are displayed by blue dashed
lines. In Stage 2, we create S_1, the red-colored path that looks like a snake.



Stage 3 Define S2 as follows. We connect the lines not covered in Stage 3 (except the tree T) 
and those nodes that are not covered by any path or line. In particular, on the left side of the 
graph, for < i < 26, we do the following. 

• If Xg^gj = (and thus x'^^q^ = due to the definition of x') then edges (v^"*"^*, t;Q^^*), 
(u^+^Sv^+^*) and {vl+^\vl-^^') are indicated to participate in H. 
• If = 1 (and thus x'^^^^.- = 1 due to the definition of x'), edges {v^^\v^^^l), {vl+^\ v\-^^')
and {v^^l.v'^^'') will participate in H. 

On the right side of the graph, for < i < 26 we indicate the following edges to participate in H: 



• 




and y^_^g(._ 


hi) 


= 


• 




and y^^g(^^ 


hi) 


= 1 


• 


^,,6+6i 2+6{i+l)^ .r , _ 
\V^_l,Vm > 11 y^+Gi — 


1 and y^^g(^^ 


hi) 


= 


• 




1 and 


hi) 


= 1 



We denote the path that results from connecting lines according to the rules above by S_2. An
example is given in Figure 10.

Stage 4 We include edges of the modified tree in a canonical way in H to form a path S3 that 
covers all nodes in T and starts and ends at Uq and Uq respectively, as follows. For all odd i such 
that < i < m — 1, we include the edge {u^, in H. For all < I < p — 1 and all < i < 
we include the edges (u', ti'^^;^) in H, and if i is odd, we also include the path connecting u[ to 
■"i-dP-'+i ™ ^- example is given in Figure [TTJ 

Stage 5 We now connect end points of the three paths. Let us investigate the six endpoints of 
the three paths. 

• End points of S3 are Uq and Uq (see Figure [TT]l . 

• Path 52 has both endpoints on the r-side (see Figure [T0|) . Let us denote these endpoints by 
ei and 62- Depending on the input strings, endpoint ei is either or v'^ and the other 
endpoint 62 is either v^^^ or v^^^. 

• The endpoints of Si are both on the r-side (see Figure [9|): and w^. 
Now we connect those endpoints in the following way. 

• Connect Si and S2, each at one endpoint on the r-side, by letting edge {ei,vj^) participate 
in H. 

• We connect the endpoint Uq of 5*3 to the endpoint of Si by marking an edge {uq,v^) as 
participating. 

• Connect the endpoint 62 to v and v to the endpoint Uq of S3 by including the corresponding 
edges of G(r,2,p)' in H. 

An example is given in Figure 10.

If the strings are equal then the result is a Hamiltonian cycle since it contains the paths S_1,
S_2, and S_3 that will be three disjoint paths (connected to be a cycle) that cover all nodes. Now
we prove that if the strings are not equal, H will not be a Hamiltonian cycle. Let i be a position
in which x_i and y_i differ. Let us consider the case that x_i = 0 and y_i = 1. Then the sequence
x'_{1+6i}, ..., x'_{6+6i} will be 100100 while the sequence y'_{1+6i}, ..., y'_{6+6i} will be 110110. When we look
at the part of the graph H corresponding to this sequence (see Figure 12), we see that H cannot
be a cycle and thus not a Hamiltonian cycle: due to y'_{2+6i} = 1 and x'_{2+6i} = 0 there are no edges
on the s- nor r-side of level 2 + 6i connecting the part of S_1 below level 2 + 6i to the part of S_1
above level 2 + 6i. There will also be no edges of S_2 that accidentally connect those two parts to
each other. The case that x_i = 1 and y_i = 0 is treated the same way: due to the construction of x'
and y' the sequence x'_{6b+1+6i}, ..., x'_{6b+6+6i} is 100100 and y'_{6b+1+6i}, ..., y'_{6b+6+6i} will be 110110 and
we can use exactly the same argument as before.

Figure 10: Continuation of the example started in Figure 9. In Stage 3, we add S_2 in dotted green
(dotted lines). S_3 is displayed in dashed brown and added in Stage 4. Finally we connect S_1, S_2
and S_3 by bold blue edges to a Hamiltonian cycle in Stage 5.

Figure 11: The modified tree for p = 4 and d = 2. The red bold edges form path S_3.

Figure 12: Example of the case that x_i = 0 and y_i = 1.

Now consider an algorithm A_ham for the Hamiltonian cycle verification problem. When A_ham
terminates, vertex s determines its output for the equality problem by stating that both input
strings are equal if and only if A_ham verified that H is a Hamiltonian cycle.

Hence a fast algorithm for the Hamiltonian cycle verification problem on G(Γ, 2, p)' would solve
the given instance of the equality problem on G(Γ, 2, p)' equally fast. This contradicts the lower
bound of the equality problem, which holds for all d (here we used d = 2).

6.2 Lower Bound of Spanning Tree and Path Verification Problems 

The remaining two deterministic lower bounds follow from the lower bound of the Hamiltonian 
cycle verification problem, as follows. 

Spanning Tree Verification Problem We reduce Hamiltonian cycle verification to spanning
tree verification with O(D) additional rounds, using the following observation: H is a Hamiltonian cycle if
and only if every vertex has degree exactly two and H \ {e}, for any edge e in H, is a spanning tree.

Therefore, to verify that H is a Hamiltonian cycle, we first check whether every vertex has
degree exactly two in H. If this is not true then H is not a Hamiltonian cycle. This part needs
O(D) rounds. Next, we check if H \ {e}, for any edge e in H, is a spanning tree. We announce
that H is a Hamiltonian cycle if and only if H \ {e} is a spanning tree.
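
The two-step check can be illustrated centrally as follows. This is a sketch only: `is_spanning_tree` is a hypothetical stand-in for a spanning tree verification algorithm, and the degree test stands in for the local check that is aggregated in O(D) rounds.

```python
def is_spanning_tree(vertices, edges):
    """True if `edges` form a spanning tree on `vertices` (n - 1 edges, no cycle)."""
    vertices = list(vertices)
    if len(edges) != len(vertices) - 1:
        return False
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:  # this edge would close a cycle
            return False
        parent[ru] = rv
    return True

def is_hamiltonian_cycle(vertices, h_edges):
    """Degree-two check first, then one spanning tree test on H minus one edge."""
    degree = {v: 0 for v in vertices}
    for u, v in h_edges:
        degree[u] += 1
        degree[v] += 1
    if not h_edges or any(d != 2 for d in degree.values()):
        return False
    return is_spanning_tree(vertices, h_edges[1:])  # drop an arbitrary edge e
```

Two disjoint triangles show why the second step is needed: every vertex has degree two, yet removing one edge leaves a graph that still contains a cycle and is not a spanning tree.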

Simple Path Verification Problem Similar to the above proof, we reduce Hamiltonian cycle
verification to path verification with O(D) additional rounds, using the following observation: H is a
Hamiltonian cycle if and only if every vertex has degree exactly two and H \ {e} is a path (without
cycles).

7 Hardness of Distributed Approximation 

In this section we show a time lower bound of Ω(√(n/(B log n))) for Monte Carlo randomized
approximation algorithms of many problems (defined in Section 2.5), as in the following theorem.

Theorem 7.1. For any polynomial function α(n), numbers p, B ≥ 1, and n ∈ {2^{2p+1}pB, 3^{2p+1}pB, ...},
there exists a constant ε > 0 such that any α(n)-approximation ε-error distributed algorithm for any
of the following problems requires Ω((n/(pB))^{1/2 − 1/(2(2p+1))}) time on some Θ(n)-vertex graph of diameter
2p + 2 in the B model: minimum spanning tree [10, 29], shallow-light tree [28], s-source distance,
shortest path tree, minimum routing cost spanning tree, minimum cut, minimum s-t
cut, shortest s-t path and generalized Steiner forest [15].

In particular, for graphs with diameter D = 4, we get an Ω((n/B)^{1/3}) lower bound and for graphs
with diameter D = log n we get Ω(√(n/(B log n))). Similar analysis also leads to an Ω(√(n/B)) lower
bound for graphs of diameter n^δ for any δ > 0, and an Ω((n/B)^{1/4}) lower bound for graphs of diameter
three using the same analysis as in [10].
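
As a quick arithmetic check of the diameter-four case, the sketch below evaluates the exponent for small p. It assumes that the bound of Theorem 7.1 reads Ω((n/(pB))^{1/2 − 1/(2(2p+1))}) on graphs of diameter 2p + 2; the function name `exponent` is ours.

```python
from fractions import Fraction

def exponent(p):
    """Exponent 1/2 - 1/(2(2p+1)) of n/(pB) for graphs of diameter 2p + 2."""
    return Fraction(1, 2) - Fraction(1, 2 * (2 * p + 1))

# Diameter 4 corresponds to p = 1, which gives the exponent 1/3.
diameter_four = exponent(1)

# The exponent equals p/(2p+1) and approaches 1/2 as p grows.
table = {2 * p + 2: exponent(p) for p in range(1, 6)}
```

For p = 1 the exponent is 1/2 − 1/6 = 1/3, and p is a constant there, which gives the (n/B)^{1/3} bound.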

The main tool used in this section is the randomized lower bound of network verification problems
defined in Section 2.4 and proved in Section 5.

The main proof idea is similar to the proof that the Traveling Salesman Problem on general
graphs cannot be approximated within α(n) for any polynomial computable function α(n) (see,
e.g., [HI): We will define a weighted graph G' in such a way that if the subgraph H satisfies the
desired property then the approximation algorithm must return some value that is at most f(n),
for some function f. Conversely, if H does not satisfy the property, the approximation algorithm
will output some value that is strictly more than f(n). Thus, we can distinguish between the two
cases.

To highlight the main idea, we first prove the theorem for the minimum spanning tree problem
in the next subsection. Proofs of other problems are in Subsection 7.2.

7.1 Lower Bound of Approximating the Minimum Spanning Tree 

Recall that in the minimum spanning tree problem, we are given a connected graph G and we want 
to compute a minimum spanning tree (i.e., a spanning tree of minimum weight). At the end of the 
process each vertex knows which edges incident to it are in the output tree. 

Let A_ε be an α(n)-approximation ε-error algorithm for the minimum spanning tree problem.
We show that A_ε can be used to solve the spanning connected subgraph verification problem using
the same running time (thus the lower bound proved in Theorem 5.1 applies).

To do this, construct a weight function on edges in G, denoted by w, by assigning weight 1 to
all edges in H and weight nα(n) to all other edges. Note that constructing w does not need any
communication since each vertex knows which edges incident to it are members of H. Call this
weighted graph G'. Now we find the weight W of the α(n)-approximated minimum spanning tree
of G' using A_ε and announce that H is a spanning connected subgraph if and only if W is less than
nα(n).

We will show that the weighted graph G' has a spanning tree of weight less than nα(n) if and
only if H is a spanning connected subgraph of G and thus the algorithm above is correct. Suppose
that H is a spanning connected subgraph. Then, there is a spanning tree that is a subgraph of
H and has weight n − 1 < nα(n) since α(n) ≥ 1. Thus the minimum spanning tree has weight
at most n − 1, and the approximate weight W is at most (n − 1)α(n) < nα(n). Conversely, suppose that H is not a spanning connected subgraph. Then, any
spanning tree of G' must contain an edge not in H. Therefore, any spanning tree has weight at
least nα(n), and so W is at least nα(n), as claimed.
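
The threshold test can be sketched centrally as follows. This is an illustration under stated assumptions, not a distributed implementation: an exact Kruskal computation stands in for A_ε, and the hypothetical parameter `slack` (between 1 and α) simulates how far above the optimum an α-approximate answer may be.

```python
def mst_weight(n, weighted_edges):
    """Exact MST weight by Kruskal; vertices are 0..n-1 and G is assumed connected."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0
    for w, u, v in sorted(weighted_edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            total += w
    return total

def verify_spanning_connected(n, g_edges, h_edges, alpha, slack=1):
    """Decide whether H is a spanning connected subgraph from an approximate MST weight."""
    h = {frozenset(e) for e in h_edges}
    heavy = n * alpha  # weight of every edge outside H
    weighted = [(1 if frozenset((u, v)) in h else heavy, u, v) for u, v in g_edges]
    w_reported = slack * mst_weight(n, weighted)  # stand-in for the output of A_eps
    return w_reported < heavy
```

Even in the worst case `slack = alpha`, a spanning connected H yields a reported weight of at most (n − 1)α, which stays below the threshold nα.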

7.2 Lower Bounds of Other Problems 

We now prove the remaining lower bounds. 

Shallow-light tree problem The lower bound for the shallow-light tree problem follows immediately
from the lower bound of the MST problem when we set the length of every edge to be one
and radius requirement to be n. In this case, every spanning tree satisfies the radius requirement and
so the minimum-weight shallow-light tree becomes the minimum spanning tree.

s-source distance and shortest path tree problems We construct the graph G' as in Subsection
7.1 and the lower bounds follow in a similar way: H is a spanning connected subgraph
if and only if the distance from s to every node is at most n − 1 (i.e., A_ε has approximated the
distance to be at most (n − 1)α(n)), which is true if and only if the shortest path spanning tree
contains only edges of weight one (i.e., the total weight of the shortest path spanning tree is at
most (n − 1)α(n)).

Minimum routing cost spanning tree problem We modify the construction of G' in Subsection
7.1 as follows. We assign weight one to edges in H and n^3 α(n) to other edges. Observe that
if H is a spanning connected subgraph, the routing cost between any pair will be at most n − 1
and thus the cost of the α(n)-approximation minimum routing cost spanning tree will be at most
(n − 1) · \binom{n}{2} · α(n) < n^3 α(n). Conversely, if H is not a spanning connected subgraph, some pair of
nodes will have routing cost at least n^3 α(n) and thus the minimum routing cost spanning tree will
cost at least n^3 α(n).

Minimum cut problem We define G' by assigning weight one to all edges in H̄ = (V, E(G) \
E(H)) and nα(n) to all other edges and use the fact that H̄ is a cut if and only if G' has a minimum
cut of weight at most n − 1, i.e., A_ε outputs a value of at most (n − 1)α(n).

Minimum s-t cut problem The reduction is the same as in the case of the minimum cut
problem. Observe that s and t are not connected in H if and only if H̄ is an s-t cut which in turn
is the case if and only if G' has a minimum s-t cut of weight n − 1. Thus, the lower bound of s-t
cut verification problem implies the lower bound of this problem.

Shortest s-t path problem We again construct G' as in Subsection 7.1. Observe that s and t
are in the same connected component in H if and only if the distance between s and t in G' is at
most n − 1, i.e., A_ε outputs a value of at most (n − 1)α(n). The lower bound follows from the lower
bound of s-t connectivity verification problem.

Generalized Steiner forest problem We will reduce from the lower bound of s-t connectivity.
We have only one set V_1 = {s, t}. Construct G' as in Subsection 7.1. Observe that the minimum
generalized Steiner forest will have weight at most n − 1 if H is s-t connected and weight at least
nα(n) otherwise. (Recall that G is assumed to be connected in the problem definition.)

8 Tightness of Lower Bounds of Verification Problems 

We note that nearly all lower bounds of verification problems stated so far are almost tight. To show
this we will present deterministic O(√n log* n + D)-time algorithms for all verification problems
except the least-element list verification problem. This upper bound almost matches the Ω̃(√n)
lower bounds shown in previous sections. Our main tool is the MST algorithm by Kutten and
Peleg [21] and the connected component algorithm by Thurimella [33, Algorithm 5].

Recall that in the MST problem, we are given a weighted network G (that is, the weight of each
edge is known to the nodes incident to it) and we want to find a minimum spanning tree (for each
edge e, nodes incident to it know whether e is in the MST or not). Kutten and Peleg [21] showed
that this problem can be solved by an O(√n log* n + D)-time distributed deterministic algorithm.

We also note the following connected component algorithm by Thurimella [33]. Given a subgraph
H of G, the algorithm outputs a label ℓ(v) for each node v such that for any two nodes u
and v, ℓ(u) = ℓ(v) if and only if u and v are in the same connected component. Theorem 6 in [33]
states that the distributed time complexity of this algorithm is O(D + f(n) + g(n) + √n) where
f(n) and g(n) are the distributed time complexities of finding an MST and a √n-dominating set,
respectively. Due to [21] we have that f(n) = g(n) = O(D + √n log* n).

We are now ready to present algorithms for our verification problems. 

Spanning connected subgraph, spanning tree, cycle containment and connectivity verification
problems We construct a weighted graph G' by assigning weight zero to all edges in
H and weight one to other edges (for each edge e, nodes incident to it know its weight). Observe
the following.

• H is a spanning tree if and only if the MST of G' has cost zero and |E(H)| = n − 1.

• H is a spanning connected subgraph if and only if the MST of G' has cost zero.

• H contains no cycle if and only if all edges in H are in the MST of G', i.e., the cost of the
MST of G' is n − 1 − |E(H)| where |E(H)| is the number of edges in H.

• H is connected if and only if there are |V(H)| − 1 edges in the MST that have cost zero,
where V(H) is the set of nodes incident to some edges in H. This is because all edges in a
spanning forest of H can be used in the MST and there are fewer than |V(H)| − 1 such edges if
and only if H is not connected.

Thus, we can verify these properties of H by finding a minimum spanning tree of G' using the
O(√n log* n + D)-time algorithm of Kutten and Peleg [21].
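
The four tests can be sketched centrally as follows. This is only an illustration: Kruskal's algorithm stands in for the distributed MST algorithm, the names `mst_stats` and `verify` are ours, and the spanning tree test additionally requires |E(H)| = n − 1, an assumption made here so that connected subgraphs containing cycles are rejected.

```python
def mst_stats(n, g_edges, h_edges):
    """Kruskal on G' (weight 0 on edges of H, 1 elsewhere); vertices are 0..n-1.

    Returns (cost of the MST, number of weight-0 edges in it). G is assumed connected.
    """
    h = {frozenset(e) for e in h_edges}
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    cost = zero_edges = 0
    weighted = sorted((0 if frozenset((u, v)) in h else 1, u, v) for u, v in g_edges)
    for w, u, v in weighted:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            cost += w
            zero_edges += 1 if w == 0 else 0
    return cost, zero_edges

def verify(n, g_edges, h_edges):
    """Evaluate the four properties of H from the MST of G'."""
    cost, zero_edges = mst_stats(n, g_edges, h_edges)
    v_h = {x for e in h_edges for x in e}  # nodes incident to some edge of H
    return {
        "spanning_connected": cost == 0,
        "spanning_tree": cost == 0 and len(h_edges) == n - 1,
        "acyclic": cost == n - 1 - len(h_edges),
        "connected": zero_edges == len(v_h) - 1,
    }
```

Because Kruskal's algorithm processes the weight-zero edges first, the zero-weight edges it selects form a spanning forest of H, which is what the last two tests rely on.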
Cut verification problem To verify if H is a cut, we simply verify if G after removing the edges
in H, i.e. H' = (V, E(G) \ E(H)), is connected.

s-t connectivity verification problem We simply run Thurimella's algorithm (as explained
above) and verify whether s and t are in the same connected component by verifying whether
ℓ(s) = ℓ(t).

Edge on all paths verification problem Observe that e lies on all paths between u and v if 
and only if u and v are disconnected in H \ {e}. Thus, we can use the s-t connectivity verification 
algorithm above to check this. 

s-t cut verification problem To verify if H is an s-t cut, we simply verify s-t connectivity of G
after removing the edges in H (i.e., H' = (V, E(G) \ E(H))).

e-cycle verification problem To verify if e is in some cycle of H, we simply verify s-t connec- 
tivity of H' = H \ {e} where s and t are the end nodes of e. 

Bipartiteness verification problem We run Thurimella's algorithm to find the connected components
of H. We note that this algorithm can in fact output a rooted spanning tree of each
connected component of H and let each node know its level in such a tree. The parity of this level gives
a natural two-coloring of nodes in H. Now all nodes check if their neighbors in H have a color different
from their own color. Every node passes this check if and only if H is bipartite.
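
The level-parity coloring can be sketched centrally as follows. This is an illustration only: breadth-first search stands in for the rooted spanning trees that the connected component algorithm is said to provide, and the function name `is_bipartite` is ours.

```python
from collections import deque

def is_bipartite(vertices, h_edges):
    """Two-color H by the parity of tree levels, then check every edge of H."""
    adj = {v: [] for v in vertices}
    for u, v in h_edges:
        adj[u].append(v)
        adj[v].append(u)

    level = {}
    for root in vertices:  # one rooted spanning tree per connected component
        if root in level:
            continue
        level[root] = 0
        queue = deque([root])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in level:
                    level[y] = level[x] + 1
                    queue.append(y)

    # Each node compares its color (level parity) with that of its neighbors in H.
    return all(level[u] % 2 != level[v] % 2 for u, v in h_edges)
```

Any rooted spanning tree would do here, since the two-coloring of a tree is unique up to swapping the colors within each component.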

9 Conclusion 

We initiate the systematic study of verification problems in the context of distributed network 
algorithms and present a uniform lower bound for several problems. We also show how these 
verification bounds can be used to obtain lower bounds on exact and approximation algorithms for 
many problems. Our techniques exploit well-known bounds in communication complexity to show 
lower bounds in distributed computing. Our techniques give a general and powerful methodology 
for showing non-trivial lower bounds for various problems in distributed computing. 

Several problems remain open. A general direction for extending all of this work is to study 
similar verification problems in special classes of graphs, e.g., a complete graph. A few specific open 
questions include proving better lower or upper bounds for the problems of shortest s-t path, single- 
source distance computation, shortest path tree, s-t cut, minimum cut. (Some of these problems 
were also asked in [7].) Also, showing randomized bounds for Hamiltonian path, spanning tree, 
and simple path verification remains open. 